Talk:Gene cluster

Article Ideas
We are hoping to expand upon this article to include more information regarding current research on gene clusters, some relevant figures and images, and some more high quality references on this topic. We would also like to add some specific examples of gene clusters to the page to help demonstrate this concept. Mnemcek (talk) 22:21, 10 March 2014 (UTC)
 * I think it would also be beneficial to link this page to other, more robust Wikipedia articles relating to relevant genetics topics. Mnemcek (talk) 01:26, 11 March 2014 (UTC)
 * Specifically, we are planning to expand upon the definition of gene clusters using relevant figures and images. We also plan to incorporate the use of Bioinformatics to identify gene clusters as well as discuss gene cluster importance in research. Kneal0627 (talk) 02:22, 11 March 2014 (UTC)
 * Who do you both mean by "we"? Maproom (talk) 08:19, 11 March 2014 (UTC)
 * Users Kneal0627 and Mnemcek will be working together on this article for a molecular biology course assignment. We are always open to suggestions and collaboration, if you are interested in improving this article. Mnemcek (talk) 16:58, 11 March 2014 (UTC)
 * Thank you, that is reassuring. I had feared, from your use of "we", that you might represent a team of researchers, with a viewpoint to push.
 * What I know of gene clusters is from Ohno's book, and I found it fascinating. But it is over 40 years since I last read it, so I can't contribute anything myself. I look forward to learning more; plenty must have been learned since then. Here are some questions, which your additions to the article might answer:
 * What is the oldest known gene cluster?
 * What is the largest known gene cluster? Maybe different answers for functional genes and for all recognisable genes.
 * I assume, or recall from Ohno, that the components of a cluster start by being adjacent, and then get spread across the genome by inversions and translocations. Is that right? Can you approximately date a cluster by how much it is spread across the chromosomes?
 * From my reading on the subject, this might be difficult to do because tandem arrayed gene clusters and dispersed gene clusters can interconvert between the two forms over time. If I come across any dating techniques using gene clusters, I will certainly add this to the article. Mnemcek (talk) 18:00, 8 April 2014 (UTC)
 * The sentence "Hemoglobin molecules contain any two identical proteins from this gene cluster..." makes little sense to me. Perhaps you could clarify it?
 * Do all sequenced species show gene clustering? Well – I know viruses don't, and I suspect bacteria don't. So what is the smallest genome that has been found to show clustering?
 * Bacteria have genes organized into operons which are a type of gene cluster. We are working on adding this to the article in the "Types of Gene Clusters" section. Mnemcek (talk) 18:00, 8 April 2014 (UTC)
 * Maproom (talk) 20:53, 11 March 2014 (UTC)

P.S. I wonder if you have looked at Human genetic clustering? It is a fork of this article, created on June 14th 2009. Maproom (talk) 20:58, 11 March 2014 (UTC)


 * Thank you for your comments and concerns about the article. We will take these into consideration as we continue our study on gene clusters and work to improve upon the article. Clusters occur as a result of duplication. As far as starting out as adjacent then spreading across the genome, I am unsure at this time. We will further investigate this theory. Bacteria actually do contain gene clusters in the form of coregulated genes aka operons. Kneal0627 (talk) 12:38, 1 April 2014 (UTC)

Comments from Richarnj
Overall your article is off to a great start. Each section is well written, and I think understandable for the average reader. The content also comes across with a neutral point of view with no original research, and has a much higher degree of verifiability with the addition of sources and inline citations.

The lead section is descriptive and concise, although I would recommend rewording or explaining "products." Some readers may not know what genes encode, so this description may be slightly too general. The content of your subsections is also descriptive and represents the subheaders well; however, I question whether all of your current subheaders should be under "Creation of Gene Clusters." For example, "Types of Gene Clusters" could be its own header if expanded upon. This could even be placed immediately after the lead section with a separate subheader for each type of gene cluster. "Gene clusters vs. tandem repeats" may make more sense outside of the "Types of Gene Clusters" header.

To further expand your content, consider addressing: How were gene clusters discovered? What experiments or observations were made? Perhaps include a short history of the research of gene clusters. This could give the reader a better overview of the entire subject rather than just a description of the gene clusters themselves. Images may also help illustrate to the reader what a gene cluster looks like. I know they are likely difficult to find, but they would be really helpful, especially in describing the difference between gene clusters and tandem repeats.

Additional wiki-linking may also help a less knowledgeable reader. Consider gene, evolution, eukaryote, protein, chromosome, and tandem repeat.
 * I agree with you here. This article has a lot of complex topics that some readers may not understand, but it is not really the place to go into detail about them. So, we will definitely add wiki-links to help guide understanding of the topics. Mnemcek (talk) 02:22, 8 April 2014 (UTC)

I will give you some additional feedback once I have a chance to read through your sources. Richarnj (talk) 13:27, 28 March 2014 (UTC)


 * I think you have some great references to start from. You have provided information from your sources without getting too close to the original wording and have also made this information less technical, which will help the typical wiki audience. I focused mostly on your first reference since you use it most often and for good reason. I would recommend using it even more, if not directly then to gather ideas of new areas to research. While I was reading it I was thinking you could add a discussion of the difference between gene clusters in prokaryotes vs. eukaryotes. You could also discuss how gene clusters are found in a genome and the effects gene clusters have compared with the traditional randomly located genes. You may also want to expand on your discussion of the evolutionary aspects of gene clusters. Richarnj (talk) 13:04, 31 March 2014 (UTC)
 * Thank you so much for your comments and feedback. We will certainly take these into consideration over the course of our next article contribution as well as throughout the remainder of the project. Kneal0627 (talk) 12:12, 1 April 2014 (UTC)
 * Thank you for your feedback. We have taken several of your recommendations into consideration. The subheaders under the section Creation of Gene clusters were consolidated. Figures were also added in this section and the content was greatly expanded. We are planning on adding a research section pertaining to gene clusters. While the history of experiments that led to the discovery of gene clusters is difficult to find, we intend to focus on how gene clusters can be, and are being, used in current research (e.g. treatment of diseases). We have also added several wiki links throughout the article that will hopefully give readers a better understanding of the material without going into so much detail in the article itself. Kneal0627 (talk) 23:44, 8 April 2014 (UTC)


 * The changes you have made look great, and it is really helpful to have the addition of the figures and wikilinks. A suggestion for working with wikilinks: if you want to link to a page but not use its exact wording, you can use the pipe symbol | within the double square brackets. Whatever is before the | is the article to which you are linking, and whatever is after the | is the text you want displayed to the reader. For example, if phenotype | phenotypic were in square brackets like a wikilink, phenotypic would appear to the reader and the link would lead to the phenotype page. This could be helpful when you have instances of (biology) in the title of a wikipage. You may also consider shortening the third header to "Types of gene clusters" and using subheadings for prokaryotes and eukaryotes. Can't wait to see how you continue to improve the article in the coming weeks. Richarnj (talk) 12:07, 10 April 2014 (UTC)
 * Thank you so much for the suggestion with wikilinks. We will definitely use this throughout the article! I think we could shorten the heading and use subheaders. Thanks! Kneal0627 (talk) 21:25, 14 April 2014 (UTC)

The article is now hard to understand
This article used to be short but easy to understand. Recent additions make it impossible to understand. Some examples:

From the first sentence: "closer to one another than anticipated". Anticipated by whom? How close did this person anticipate that such genes would be?

From the second sentence: "varieties of gene clusters". What is a "variety" of a gene cluster?

From the third sentence: "to identify ethnic groups within Homo sapiens". Are there groups within Homo sapiens that differ in their gene clusters? This seems most unlikely, and if true should be supported by a reference.

The fourth sentence: "The presence of gene clusters suggests that a cluster provides an evolutionary advantage for the organism". No, it suggests that the genes within the cluster derive from a common ancestor, by duplication and divergence, as described by Ohno. Maproom (talk) 21:48, 29 March 2014 (UTC)


 * I have deleted the second and third sentences. I realised that their writer had confused the concepts of "gene cluster" and "genetic clustering". Maproom (talk) 08:41, 30 March 2014 (UTC)


 * Thank you for expressing your concerns and specifying examples of confusion. We will take these into consideration as we continue to improve upon the article. Addressing your concern for the sentence "The presence of gene clusters suggests that a cluster provides an evolutionary advantage for the organism." Yes, genes within a cluster are indicative that they arise from a common ancestor; however, genes within a cluster encode for the same function or variations of that function. Also, gene clusters aid in the horizontal transfer of the complete gene cluster from one species to another. This has become evident through various research and will be further clarified. Kneal0627 (talk) 12:31, 1 April 2014 (UTC)
 * Because this article is part of a semester project and we are graded on our contributions, we ask that you please refrain from making deletions. Please express your concerns on the article talk page and allow us ample time to respond. If you feel the need to make a deletion, please ensure that the reference is retained if used throughout the remainder of the article. Thanks! Kneal0627 (talk) 12:40, 1 April 2014 (UTC)
 * I am here to try to improve Wikipedia, not to help you achieve a good grade for your project. However, I can probably hold off making further edits until your project has been graded – when will that be?
 * Meanwhile, I have a suggestion for how you might improve the article. Start by defining "gene cluster"; then provide information on what is observed about gene clusters; and finally report on knowledge and speculation on how gene clusters are generated and maintained. At present, this material is all there in the article, but in a rather jumbled order. Maproom (talk) 14:31, 1 April 2014 (UTC)
 * Thank you. We simply ask that you please consider that we are continually working on improving this article as well. The citation that you deleted was correct and did mention coinheritance. I have replaced it. Please take a careful look at the fourth paragraph under the Introduction section of the reference. It states, 'Alternatively, coinheritance may provide the motive force for driving the clustering of genes.' Our project will be fully graded by May 6. Thank you for your suggestions. We will take these into consideration as we work to make our corrections/edits for this week. Kneal0627 (talk) 20:36, 1 April 2014 (UTC)
 * I apologise. I now realise that what I read was merely the abstract of the cited paper, which does not include the word "coinheritance". However, I believe that if you are to use the word "coinheritance" in the article, you should provide an explanation of its meaning. The paper cited is behind a paywall, and Google yields nothing clearly relevant, so it is not a term you can expect readers here to understand. Maproom (talk) 21:04, 1 April 2014 (UTC)

Comments from mmehta10
Great start! This topic seems extremely interesting and will probably have great viewership on Wikipedia. I would like to provide some feedback so the learning community can benefit more from this article. I like that you have followed most of the criteria for a good article on the Wikipedia platform: it has been written with a neutral point of view and broadly covers the main aspects of the topic.

I did notice a few points which you may want to improve, one of which is perhaps adding some references for the information under "creation of gene clusters". Also, adding some images alongside the text will surely be appreciated by our visual learners, as readers can grasp the concepts better. Additional detail regarding how these gene clusters are being identified and used currently can also help in making the article more complete. I really like that you have added wiki-links in order to give readers access to expand their knowledge of this topic easily. The references have been accurately cited and are all from reliable sources. Mmehta10 (talk) 03:17, 31 March 2014 (UTC)
 * Thank you for your feedback and suggestions for the article. We will take these into consideration as we work to improve upon the content. Initially, there was a reference for the information under "Creation of gene clusters"; however, a deletion in another area by a different user resulted in the loss of the citation. This will be corrected very soon. Kneal0627 (talk) 12:43, 1 April 2014 (UTC)
 * Thank you for your recommendations! Figures were added under the section "Creation of Gene Clusters." References were also corrected in this area, and the content was greatly expanded. We intend to add a "Research" section that will be aimed at the use of gene clusters in current and future research for the treatment of diseases; however, we are still gathering more information for this section. It will be added relatively soon.
 * Sorry, I forgot my signature. Kneal0627 (talk) 23:48, 8 April 2014 (UTC)
 * Hi all, the article seems to be shaping up quite well! Keep up the good work! One quick comment: some detail seems to be "missing" in reference 11: http://david.abcc.ncifcrf.gov/home.jsp. Missing or empty |title= (help). Perhaps this can be re-cited? Mmehta10 (talk) 01:02, 16 April 2014 (UTC)
 * I will try to improve the citation, thanks for the heads up. 71.237.32.30 (talk) 02:14, 23 April 2014 (UTC) Sorry, signed when I wasn't logged in! Mnemcek (talk) 02:17, 23 April 2014 (UTC)

Comments from Keilana
Hi guys, great work so far! I have some suggestions for you that I hope you'll find helpful. Please feel free to ask me via email or on my talk page if you have any questions about what I've suggested. Thank you!


 * The next thing you should be thinking about is expanding the lead.
 * Also, once you've expanded the lead, Wikipedia house style has it that we don't cite facts in the lead as long as they are discussed and cited elsewhere in the article - just something to keep in mind as you expand it.
 * As you've noted with the citation needed tags, the section on creation needs citations.
 * The writing in general could be simplified a little bit, and I also suggest you gloss complex concepts. One thing that can help with that is wikilinks to other articles that discuss the concepts in greater detail.
 * You don't need to repeat citations in each sentence, for example, in the second paragraph of the co-expression sections, you only need the Yi citation at the end.
 * Please paraphrase more carefully. The second paragraph of co-expression is far too closely paraphrased from the source, and that's just what I found from my spot check. Close paraphrasing qualifies as plagiarism.

Let me know if you have any questions. Keilana|Parlez ici 15:55, 31 March 2014 (UTC)
 * Thank you for your feedback and suggestions for the article. We will take these into consideration as we work to improve upon the content. Initially, there was a reference for the information under "Creation of gene clusters"; however, a deletion in another area by a different user resulted in the loss of the citation. This will be corrected very soon. Thank you for your note regarding paraphrasing. I ran this through turnitin.com and it did not come up flagged; however, I will work to correct this. — Preceding unsigned comment added by Kneal0627 (talk • contribs) 12:49, 1 April 2014 (UTC)
 * Thank you for your recommendations! The lead has been expanded. Citations have been added under the section "Creation of Gene Clusters." Sections were rewritten in order to avoid close paraphrasing. The material under this section has also been further expanded in order to provide a better understanding for readers. Also, wiki links were added throughout the article to provide a means for the audience to understand concepts without going into further detail within the article itself. Once a wikilink has been used for one term, should we continue to use it each time the term is used throughout the article or just the initial one time? Kneal0627 (talk) 23:53, 8 April 2014 (UTC)

Comments from Maproom
I am doubtful of the sentence "Coordinated gene expression, or co-expression as a result of codominance, is considered to be the most common mechanism driving the formation of gene clusters; however, coinheritance has also been considered as a driving force for the formation of gene clusters."

I don't understand how codominance could provide a mechanism for the creation of gene clusters; though it could explain their evolutionary success, once created. I have been unable to find any explanation of the term "coinheritance", in Wikipedia or elsewhere. The source cited is a paper unavailable to me online; its abstract is about the identification of gene clusters, not their origin. Maproom (talk) 06:58, 4 April 2014 (UTC)

Also, there is something wrong with "Gene clusters may be similar to that of an operon.". What does "that" refer to? Maproom (talk) 07:03, 4 April 2014 (UTC)
 * The term "coinheritance" simply means joint inheritance. That is, the genes are inherited together. Google provides a definition as well as several other free journal articles. We will take all your concerns under consideration as we continue to elaborate on concepts and edit the article. Thanks! 96.36.136.218 (talk) 02:28, 5 April 2014 (UTC)

Comments from Tatabox8
Great job on your article; it looks great! The article is well written and I can tell you have spent a great deal of time on it. The writing is informative and unbiased. The researchers that come upon this article will definitely have their questions answered and be informed on the subject matter. The wikilinks are great and will help readers expand on topics they are unfamiliar with. I agree with Richarnj on shortening the heading "Types of gene clusters: Prokaryotic gene clusters vs. Eukaryotic gene clusters"; the title is simply too long and will look simpler and more straightforward if shortened, with subheaders used within the section. I was informed by Keilana that we don't have to cite the same source in back-to-back sentences. I noticed that in the creation of gene clusters section, source 7 was cited back to back. Also, looking at your reference list, it looks like references 4 and 5 are the exact same reference somehow listed twice. Reference number 11 is missing some information to be a complete source. Perhaps you're waiting on the website to get more information on how to cite the material. I like the addition of the images and the simple description under each one. I find that sometimes looking at the images gives me a good concept of the topic and then reading the material reinforces it. Keep up the good work! Tatabox8 (talk) 04:55, 11 April 2014 (UTC)
 * Thank you so much for all your suggestions. We will take this under consideration as we work to improve upon the article. Thank you for pointing out the references. I hadn't noticed that but will correct it. Kneal0627 (talk) 21:29, 14 April 2014 (UTC)
 * Your article is coming along nicely. I see that you have fixed all the references to reflect your sources throughout the article. You have a great content outline, but have you thought about adding an image in the lead section somewhere? Perhaps hemoglobin? There is an impressive outline and an image would fill in some of that space. Your images all have good detailed captions. You have very detailed descriptions of the individual sections, but if there are any images you can add, that would add a nice touch. For example, I did a quick search on Wikimedia Commons and found several images on microarrays. Check out the link below. Information is great, but sometimes when you have an image it just reinforces what you read. I also noticed that there is a disambiguation link, gene family, in the see also section; you may want to remove that since it doesn't actually lead to any article. Overall, great job guys, your article has come a long way. Tatabox8 (talk) 02:48, 24 April 2014 (UTC)


 * 1) https://commons.wikimedia.org/w/index.php?search=microarray&title=Special%3ASearch&go=Go
 * Thank you for your comments! We have been searching for more relevant images to use throughout the article. Thank you for the link to this image! Kneal0627 (talk) 22:09, 25 April 2014 (UTC)
 * I removed the gene family link that was in the see also section. Thanks for the heads up! Kneal0627 (talk) 22:25, 25 April 2014 (UTC)
 * I like the images you have added to the article, the wiki-links and the new text are great. Tatabox8 (talk) 03:48, 5 May 2014 (UTC)
 * Wow, it looks like you added a great deal of text yesterday. Looking over the article I noticed that reference number 10 has an error "Cite error: The named reference Lawrence.26Rother was invoked but never defined (see the help page)." You may want to check it out and update it. Tatabox8 (talk) 05:12, 6 May 2014 (UTC)


 * The referencing error is caused by defining a reference name "Lawrence&Rothe", and then using the name "Lawrence&Rother". BTW, I found the Lawrence papers really interesting. I see that the "Fisher model" was proposed by Ronald Fisher in 1930, and is dismissed by Lawrence (and was dismissed by my teachers in 1972); I feel that it does not deserve so much space in the article. Maproom (talk) 06:52, 6 May 2014 (UTC)
 * Thanks for catching the error! That was a typo. It has been corrected. I also found their papers interesting. I clarified that it was unlikely and dismissed, but I think it's important to include this model as it references how the reasoning for gene clusters came about. Kneal0627 (talk) 16:55, 6 May 2014 (UTC)

Comments from Klortho
The comments below are just my suggestions for a couple of ways it might be improved. Keep in mind that they are just suggestions, and if you disagree, or these conflict with other reviewers' suggestions, then do what you think is best.


 * The section beginning "Gene clusters may also be formed ..." is giving me a similarity match in Turnitin, but I don't have access to the original source book "Evolution by Gene Duplication". From what Turnitin is saying, though, the paraphrasing is too close.  Could you have another look and make sure that it is written in your own words?


 * You need to do some copy-editing for grammar. For example, in the lead paragraph, there are two number-disagreements: "protein" should be "proteins", and "Portions of the DNA ... is ...".  One trick is to copy-paste the article into MS Word or a similar word processor, and turn on its grammar checker.


 * In the link in your lead to "homology", code it this way, so that the reader doesn't see the "(biology)": "[[Homology (biology)|homology]]".


 * In the lead, it is not immediately clear that you are talking about different genes across species. I started out thinking it was a cluster of genes on the same chromosome, and it wasn't until the last part that I realized these are homologous genes in different species.  It would be nice if that were clear right away.


 * But wait: as I read on, it seems that you switched to talking about them as I originally thought the term meant: a cluster of similar genes located close together within a chromosome.  So now I am confused.  What do gene clusters have to do with phylogeny?  Please try to make the lead more clear.


 * In general, your headings should be shorter, and you can remove "gene cluster" from them. So, for example, "Creation of gene clusters" -> "Creation".  (But, I like "Formation" better, I think.)


 * You don't need the sentence, "The process was described by Susumu Ohno in his book Evolution by Gene Duplication (1970)." Just the reference is fine.  In later sentences, you don't need "Ohno contended", "Ohno argued", etc., unless these statements are controversial.  If they are controversial, make that clear.  If not, then just state them as facts.


 * Your paragraphs should be shorter.


 * I would change the title of the section "Bioinformatics" to "Identification", or "Methods of identification".


 * I think it would be nice to expand that section a bit. What are the challenges of identifying gene clusters?


 * I think the outline of your article could use a little work. Right now, you have a lot of content in "Creation", and not very much in each of the other sections.  I'm not sure what to suggest, but you might take a look at some similar articles, to see how they are organized, and then think about moving your content around a bit.


 * I think you have found some very nice figures, and you make good use of them.

Keep up the good work! Klortho (talk) 05:23, 14 April 2014 (UTC)


 * Thank you so much for all your suggestions! We will take all of these into consideration as we work to improve upon the article. Kneal0627 (talk) 21:32, 14 April 2014 (UTC)
 * Hi, Klortho! Thanks again for all your suggestions! I submitted the article to turnitin, and I did not receive a similarity match for the section you described. Is there a specific portion of the section that was showing similarity? Several of your suggestions were implemented. The grammar errors were corrected, headings were changed and/or shortened, and sentences were removed. A vast amount of information was also added. Please let us know if you have any additional feedback. Thanks! Kneal0627 (talk) 22:24, 25 April 2014 (UTC)

Comments from PaleoBioJackie
Great job so far! There is a lot of great information in your article. Here are a few suggestions:


 * Your diagrams are great and there is a lot of information there, but they are too small to read! This is especially true of the second photo, where one would have to click into it to see it, but then click the back button to read the description. Readers cannot view the diagram and the description at the same time. It is easy to resize images: you need to add |300px| (or whichever size you choose) into the editing area of the photos. If this is unclear, try finding a Wikipedia article that has a larger image and click edit to learn how they made it bigger.


 * I would love to see an image closer to the beginning of the article to help draw people in.


 * The body of the article is good, but the introduction could use some work. It is a little difficult to understand and there are some grammatical errors, specifically plural vs singular form seems to switch within the same sentence a couple of times.
 * Thanks for the feedback. I wanted to address this point especially. In the next couple of weeks I'll be going through to correct grammatical and language issues to clear up the understanding. I agree, there are times that it is hard to understand. We just wanted to get the meat of the article down first. Thanks! Mnemcek (talk) 03:35, 23 April 2014 (UTC)


 * For the WikiLinks, if the context word you want to use is plural (i.e. proteins) but the article is singular, you could code: "[[protein]]s". In context, this would look like: "Some proteins have been found to ....." for example.


 * It might be helpful to clarify in the intro: Are gene clusters found in each organism, or are they a means of comparing similar genes across several organisms/species?
 * Good point, I'll add this clarification to the intro. Mnemcek (talk) 03:35, 23 April 2014 (UTC)


 * The topic is written pretty well for a layman to understand, but I think there is still more that can be done in this area. Overall, good job making a tough subject easily accessible to non-scientists!
 * Yeah, this one was a tough topic to try to make understandable. If you have any other suggestions on how to clear things up, let us know! Mnemcek (talk) 03:35, 23 April 2014 (UTC)

PaleoBioJackie (talk) 18:12, 14 April 2014 (UTC)
 * I have tried playing around with the size of the images. This is the best I could come up with. Anything past this causes the images to overshadow the text, making the article look funny. I'll keep trying though! Thank you so much for all your suggestions! We will take all of these into consideration as we work to improve upon the article. — Preceding unsigned comment added by Kneal0627 (talk • contribs) 21:35, 14 April 2014 (UTC)


 * PaleoBioJackie: If you use Chrome (and I think other browsers) you can shift-click on an image to see it full size in another window, without leaving the article. Maproom (talk) 14:47, 23 April 2014 (UTC)
 * Hi, PaleoBioJackie! The grammatical errors have been corrected. There was some content added to the intro; however, we will continue to work on it to further clarify it. A lot of content was also added to the article, so hopefully it will make the topic easier to understand. Thank you for mentioning that Maproom! Kneal0627 (talk) 22:28, 25 April 2014 (UTC)

Comments from Graeme
Where you talk about related species, you had better mention that it's the same gene cluster. Hemoglobin is a protein family, so it perhaps needs slightly more explanation.

In the Coordinated gene expression part, the first sentence is unclear because we do not know what codominance is.

Better to explain what coordinated gene expression is first and then tell us the driving force and why it is so.

"A gene is duplicated during cell division" - this is completely normal, but it does not lead to "two end-to-end copies" normally, so please explain more clearly.

"It was theorized": better to say who theorized it. And this qualifier suggests that there is no evidence to support it, whereas this idea may be widely supported.

"Gene conversion" and "homogenized" both need to be explained in the sentence.

In the tandem arrays section there seems to be repetition about "essential"

Caenorhabditis elegans is a worm not a bacterium. Also Ciona intestinalis is not a procaryote.

It would be good to explain DAL and GAL a bit more as there are no Wikipedia articles on the topic yet.

more poor --> poorer

Terms to link in the lede: gene chromosome protein

More terms to link: Ciona intestinalis operon hybridizations bioinformatics (should be lower case b) dendogram chimaerism Digital transcriptome subtraction when talking about vector adaptor contamination 3p21.3 (mentioned so many times in Wikipedia but no article, closest is Chromosome 3 (human)); breast cancer lung cancer methylation, epigenetic Histone acetyltransferase

Grammar: organisms' (you are talking about one); " of genes on within" " despite it was initially thought" "programs exist which conducts"

Spelling to fix: ProtHox

Style: only the first word in a header should be capitalized.

Don't put terms in the "See also" section that are used and linked in the article. So, for example, link gene families in your article. Graeme Bartlett (talk) 06:46, 26 April 2014 (UTC)
 * Thank you for your comments and concerns. We will take all into consideration as we work to improve upon the article. Kneal0627 (talk) 20:45, 29 April 2014 (UTC)

Comments from SSumpf

 * This article is coming along nicely with the prose that you have added. Great job on that! You may want to add references and images in addition to the 18 references and two images that you have. The language is neutral, unbiased, and informative.
 * The lead section comes before the table of contents and the first heading, so it follows the style guidelines. Terms such as homology, DNA, and hemoglobin are appropriately wikilinked in this section. There are no references in the lead, but I was informed in a previous comment that references are not necessary in the lead as long as they are discussed and cited elsewhere in the article. If this is correct, I believe you have cited the lead concepts elsewhere. Overall, the writing is clear and comprehensible in the lead section. You could add an image of the hemoglobin structure to visually support your last sentence.
 * I am curious what coinheritance is referring to in the “Coordinated Gene Expression” section. You could add more wikilinks in this section as well, for example, transcription factors, metabolic pathway, and genome. More wikilinks can be added throughout the article in general (e.g., divergence, tandem duplication, Ciona intestinalis, etc.).
 * I love the organization and structure of the article and do not see any need for improvement. Each subsection is appropriate for each section heading.
 * I would like to know specific examples for the statement, “A variety of programs exist in the Bioinformatics field which allows for easy analysis of gene cluster problems” in the “Algorithms” section.
 * The “Key algorithmic terminology” section is really confusing to me. I would like to see more references here as well as an example of a problem using these terms to put them into context and perhaps understand how they work better. Also, you use the same reference for “Algorithms”, “Key algorithmic terminology”, “Agglomerative hierarchical clustering”, “Self-Organizing Maps”, and “CLICK” sections, yet you state other authors in the prose. Perhaps you can incorporate prose from those articles, rather than the Sharan et al. reference. It would be great to see 1-2 different references in each section along with images or specific examples of how they are used as these topics are confusing to me.
 * In the “Agglomerative hierarchical clustering” section, did you mean a “dendrogram”? You could provide a wikilink and an image and explain top-down or bottom-up better. In “self-organizing maps”, what is a node, what does it look like? What is a reference vector or an input vector? An image or example would help clarify this. Same thing for kernels in the “CLICK” section as well as “diametric clustering” and terms like “anti-correlated” or “dominant singular vector”.
 * Overall, the writing is clear and avoids plagiarism. Some of the algorithms could be simplified or expanded upon with more examples or images for clarification.
 * Great job overall! I could tell you have spent a great deal of time and effort improving this article. Ssumpf (talk) 20:52, 27 April 2014 (UTC)
 * Please note in reference to your comment about the Bioinformatics section, specific programs are named corresponding to the type of algorithm followed. We understand that the Bioinformatics section may be difficult for some to understand; however, we are still working to further clarify this section. As for imaging, we are still investigating possible images that are not copyrighted. Thank you for all your comments and concerns! We will take all into consideration as we work to improve the article. Kneal0627 (talk) 20:51, 29 April 2014 (UTC)
 * Thank you, Kneal0627! Good luck on your final post. I had trouble finding images outside of wikicommons as well. Ssumpf (talk) 01:53, 30 April 2014 (UTC)
 * I am curious if you tried using any image creation software as opposed to searching for images? I had a hard time using the software suggested in class, but perhaps you would have better luck. I would definitely like to see more images and clarification on the bioinformatics section. It may be as simple as defining the language more or adding examples of how the algorithms are used in non-technical language. Ssumpf (talk) 19:33, 3 May 2014 (UTC)
 * I also found the software difficult; however, I have found some images that I may be able to use. I'm going to work on clarifying this section tonight. Kneal0627 (talk) 14:04, 5 May 2014 (UTC)

Many misunderstandings
I do not share other editors' positive views about recent changes to the article. It appears to me that misunderstandings have been introduced.

In the lede:
 * Genes do not code for proteins. They code for peptides. A protein typically comprises several peptides.
 * The definition of "gene cluster" should state that the genes in it are homologous.
 * Just a couple of notes here: genes do code for proteins, in that the products of genes are proteins. Making the distinction in this article is not necessary. Yes, there are steps between the transcribed gene and the final protein product, but that is beyond the scope of this article. Mnemcek (talk) 19:12, 6 May 2014 (UTC)


 * "Because of the homology of the DNA sequences, the presence of gene clusters on the same chromosome suggests a close evolutionary relationship between two species." No, it suggests nothing. The homology is part of the definition. Their presence on the same chromosome suggests that the origin of the cluster was by gene duplication, as described by Ohno.

In the "Coordinated Gene Expression" section, I find the first sentence hard to believe. It ignores the mechanism proposed by Ohno as the origin of gene clusters, and suggests two mechanisms which may explain the subsequent divergence of the genes within a cluster, but cannot explain its origin. The reference cited is behind a paywall, and I have been unable to read it; but its abstract, which is freely readable, is about identification of gene clusters, not about their origin. I also wonder if, when this section says "gene expression", it means "gene regulation". Maproom (talk) 14:56, 28 April 2014 (UTC)

And most seriously, the word "clustering" appears 19 times in the article. In every instance, this is a result of an editor confusing genetic clustering (as described at Human genetic clustering) with gene clusters. These are entirely different phenomena. A gene cluster is a group of genes within a genome that are homologous in origin and have related functions. Genetic clustering is the use of cluster analysis to study the degree of relatedness of populations. Maproom (talk) 15:16, 28 April 2014 (UTC)


 * Thank you for expressing your concerns pertaining to the article. We will take all your concerns into consideration as we work to improve upon the article. Kneal0627 (talk) 20:54, 29 April 2014 (UTC)


 * Hi, Maproom! I apologize for the delay. I have found some other articles, which are available free to everyone, to further explain the Coordinated Gene Expression section. I will be adding this later this evening. Thank you for your patience! Kneal0627 (talk) 14:06, 5 May 2014 (UTC)

Suggested words to wikilink

 * chromosome (first seen in lead)
 * transcription factors (first seen in Formation)
 * metabolic (first seen in Formation)
 * divergence (first seen in Gene Duplication)
 * embryonic cell (first seen in Tandem arrays)
 * Ciona intestinalis (first seen in Types)
 * Homogeneity (first seen in Algorithms)
 * Dendogram (first seen in Agglomerative hierarchical clustering)
 * Etiology (first seen in Research)
 * Tumor (first seen in Research)
 * Cancer (first seen in Research)
 * Pathogenesis (first seen in Research)

Lead

 * Your lead doesn't have any sources associated with it. I understand that the lead section isn't required to have sources in it, but that opens up the possibility of people challenging the information contained in the lead.
 * The first sentence of your lead seems rather long being that it contains 5 different facts about gene clusters. It might be easier for the reader to understand if you break this sentence apart.
 * Wikipedia's information on lead sections dictates that, "if the subject of the page has a common abbreviation or more than one name, the abbreviation (in parentheses) and each additional name should be in boldface on its first appearance." You did a good job putting gene cluster in bold, I would recommend doing the same with gene family.
 * I find your lead very interesting. The point you make about the relationship between gene cluster locations within chromosomes and evolutionary relationships between species is intriguing.  The style of writing is neutral and I didn't detect any grammatical errors.

Formation

 * Again, the first sentence here has at least 5 different facts. I would recommend trying to reformat it to: "Coordinated gene expression is also known as co-expression as a result of codominance.  It is considered to be the most common mechanism driving the formation of gene clusters.  However, coinheritance has also been considered as a driving force for the formation of gene clusters."  It is just a matter of personal opinion but I find it easier to read.
 * The second sentence follows the exact same format as your first sentence. You have multiple sentences separated by ";however," and it could easily be broken up.
 * I made mention above that you could wikilink divergence to the genetic divergence wiki page. While it is still a stub, I think it could inspire other people who are reading your article to update it.
 * There are several words that are wikilinked in your section titled Gene Duplication. They were mentioned earlier in the article and only wikilinked after several uses of the word.  I imagine this is because you started with this section and added the previous sections later.  Those words are chromosome and evolution.  You can wikilink these in the lead and un-wikilink them in this section.
 * The sentence, "Because there was only a single copy of the gene, they could not undergo mutations which would potentially result in new genes; however, gene duplication allows essential genes to undergo mutations in the duplicated copy, which would ultimately give rise to new genes over the course of evolution," is worth considering reformatting.
 * Likewise, the following sentence could be reformatted: "Over a short span of time, the new genetic information exhibited by the duplicated copy of the essential gene would not serve a practical advantage; however, over a long, evolutionary time period, the genetic information in the duplicated copy may undergo additional and drastic mutations in which the proteins of the duplicated gene served a different role than those of the original essential gene."
 * " It is unknown the exact number of genes contained in the duplicated Protohox cluster; however, models exist suggesting that the duplicated Protohox cluster originally contained four, three, or two genes." How about:  The exact number of genes contained in the duplicated Protohox cluster is unknown.  There are models that suggest the duplicated Protohox cluster originally contained anywhere from two to four genes.  Perhaps I am overlooking the reason you numbered in the order "four, three, or two genes."
 * "Loss of genes is dependent of the number of genes originating in the gene cluster." Maybe change to:  Loss of genes is dependent on the... or The loss of genes is dependent on the...
 * "The formation of the Hox cluster and the ParaHox cluster were a result of intrachromosomal duplication despite it was initially thought to be interchromosomal." Add the word "that" in between "despite" and "it".
 * In closing, I feel like after reading this section I was able to understand a complicated topic. You're on the right track.

Gene clusters vs. tandem arrays

 * "Portions of the DNA sequence of a gene is found to be identical in genes contained in a gene cluster." Change "is" to "are".
 * "Gene clusters change over a long evolutionary time period, which does not result in genetic complexity." Maybe explain why genetic complexity doesn't increase as a result of change.  I'm curious.
 * In the last paragraph of Tandem arrays, you could have chosen to wikilink ribosomal RNA earlier in the sentence. It's not a huge deal but consistently wikilinking words at their first appearance is more desirable than doing so after the word has been used several times.
 * You did a nice job comparing the differences between these two items of interest.

Types

 * I am curious as to why you wikilinked plants and insects in the first sentence of Eukaryotic gene clusters. It seems unnecessary.
 * You wikilinked yeast twice in the first two sentences of Eukaryotic gene clusters. Not a big deal, just pointing it out.
 * Eukaryotes is wikilinked again unnecessarily in the section Eukaryotic gene clusters.
 * Horizontal gene transfer is wikilinked twice in the section Eukaryotic gene clusters.
 * Are there any other differences between the two types of gene clusters? It is an interesting section.

Detection

 * Unnecessarily wikilinked genome.
 * " The data found within the matrix demonstrates the level of expression for each gene specific for a type of condition or length of time." This sentence is confusing, consider rewording.
 * This would be a good section to have a picture of results from a gene expression experiment.
 * Thanks, I added a microarray analysis image to this section. Mnemcek (talk) 18:56, 6 May 2014 (UTC)


 * "Elements, which are commonly genes, and a characteristic vector for an element, which is a gene's pattern of expression, make up a clustering problem." Consider rewording.
 * "In contrast, separation is defined as genes found in different clusters exhibit low similarity to one another." Add a "that" between "clusters" and "exhibit".
 * You could consider doing an outside link to the software programs Cluster, GeneCluster, and CLICK (although I did have trouble finding a main page for CLICK).
 * "Kernels" are the basis for clusters. Perhaps you could elaborate further.
 * It's nice to have multiple pairs of eyes reading over this work. It is easy to miss little mistakes like these. Thanks for your input. Mnemcek (talk) 18:59, 6 May 2014 (UTC)

Overall I really enjoyed learning about gene clusters. Your overall layout is very fluid and works nicely. I hope my suggestions are helpful in your future endeavors. Previte01 (talk) 02:46, 30 April 2014 (UTC)
 * Thank you for your comments and concerns. We will take all of these into consideration as we work to improve upon the article. Some of the sentences you suggested for rewording cannot be revised since the result would be a close paraphrase of the original, which is considered plagiarism. I will try to figure out something to make it easier for the reader though. Also, do you know how to provide an outside link on a Wikipedia page? Kneal0627 (talk) 14:11, 5 May 2014 (UTC)
 * I hope you don't mind if I add my few cents over here. To add an external link you simply have to click on the link sign in the editing toolbar just like you do for wikilinking. The only difference is that you select external links. E.g.: Cluster Software. Hope this helps! Mmehta10 (talk) 01:24, 7 May 2014 (UTC)

Gene cluster article
The following is copied from Wikipedia_talk:WikiProject_Genetics. Maproom (talk) 11:06, 19 January 2015 (UTC)

Eight months ago, I wrote


 * In mid-March, two students announced on the talk page of the Gene cluster article that they were planning to improve the article, as a college project. They requested other editors not to edit the article until their project was assessed, on May 7th. They then made many changes to the article, adding a lot of new material.


 * Other editors praised their efforts. I criticised them, as I believed they were incorporating errors and misunderstandings into the article. They accepted some of my criticisms, and made some corrections.


 * Their deadline is now a week past, and I assume that their project is over, though they and their professor have given no feedback on it. I believe that they have made many improvements to the article, most notably the addition of material about Hox genes and the Homeobox family. But I also believe that some of the errors they introduced are still there, and should be removed.


 * However, I believe that I am not the best person to clear up the errors. While I believe I am technically competent to do it, I feel some "commitment" to the article, which must be a bad thing. I would prefer another editor to take a lead here. I have already stated many of my views on the errors, on the article's talk page, and can provide further details if asked. Maproom (talk) 11:19, 15 May 2014 (UTC)

No-one responded (and today, coincidentally, my request was archived). I am planning to work on the article myself soon. I shall copy this to its talk page.

While the students made many improvements to the article, they added a long section on formation, discussing various theories about the origin of gene clusters. But this is absurd; the origin of gene clusters (by duplication and divergence) was known in 1972, and is not in doubt. This is acknowledged in the second sentence of the article "A gene cluster is part of a gene family": The gene family article starts "A gene family is a set of several similar genes, formed by duplication of a single original gene."

I will replace the long "formation" section by a much shorter historical section, mentioning the various pre-1970 conjectures. Maproom (talk) 11:01, 19 January 2015 (UTC)

Definition of "gene cluster"
The first paragraph of the article currently says that the members of a cluster "are located within a few thousand base pairs of each other"; and that "Genes found in a gene cluster may be ... on different ... chromosomes." I could accept either, but not both at once. I will appreciate some help in sorting out this mess. Maproom (talk) 23:22, 14 February 2015 (UTC)


 * With the recent addition of a subsection on "Metabolic gene clusters", the article is drifting further from its purported subject, as explained in its lead, which is groups of homologous genes within one organism. My view is that the entire "Types" section should be removed as off-topic, and possibly transferred to some other article. I would like to hear the views of other editors. Maproom (talk) 07:22, 12 July 2016 (UTC)

External links modified
Hello fellow Wikipedians,

I have just modified one external link on Gene cluster. Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FaQ for additional information. I made the following changes:
 * Added archive https://web.archive.org/web/20100528103947/http://www.pitt.edu/~biohome/Dept/pdf/378.pdf to http://www2.pitt.edu/~biohome/Dept/pdf/378.pdf

When you have finished reviewing my changes, you may follow the instructions on the template below to fix any issues with the URLs.

Cheers.— InternetArchiveBot  (Report bug) 01:27, 9 January 2017 (UTC)

Need for disambiguation
I was shocked that the Gene cluster Wikipedia page was devoted solely to a specific type of gene cluster (albeit one very important to evo-devo studies), and ignored the more widespread use of the term, which includes secondary metabolism clusters and others from bacteria, fungi, plants and others. My quick review of the history of the article here seems to point to these getting pushed out some time ago because they did not fit the author's constrained definition of gene cluster. In fact, in comparative genomics, "gene cluster" has at least 4 connotations: 1. clusters of orthologous genes (similarity groups), 2. Hox/tandem paralog clusters, 3. metabolic gene clusters (tightly linked non-homologous genes contributing to a common function), and 4. genes with coordinated expression in transcriptomes.

Probably the most parsimonious solution would be to have a disambiguation page, and rename the current gene cluster page into what it actually is, the Hox gene cluster page. Slotjc (talk) 18:46, 15 March 2018 (UTC)
 * I don't mind this solution. Natureium (talk) 18:51, 15 March 2018 (UTC)
 * A bit of history may be relevant here. A few years ago Gene cluster was about sets of homologous genes. Two college students were assigned the task of improving it. Much of what they did made it worse, as they added topics which weren't related to that sense of gene cluster, and a lot of material debating why gene clusters exist (though their origin by duplication and divergence had been established decades previously). Once their project was over, I removed the irrelevant material. But I left everything relevant that they'd added, much of it being about the Hox clusters. (BTW I've already seen your new page Metabolic gene cluster. I'm impressed.) Maproom (talk) 19:16, 15 March 2018 (UTC)
 * I see now that there's a proposal to merge Gene cluster and Metabolic gene cluster. I strongly believe that this would be a mistake. They are quite distinct topics, and this should be made clear to readers. But I'm opposed to the idea of a DaB page. Wikipedia guidelines recommend the use of a DaB page only when there are three or more topics that might be confused. When (as here) there's only two, there should just be a hat note on each article, linking to the other. has already done that. Maproom (talk) 19:26, 15 March 2018 (UTC)
 * Would it be more appropriate to refine the title of the "Gene cluster" page to "Hox gene cluster"? The hat note links would still be appropriate DaB until somebody from the comp. gen. world wants to write a page about the other kinds of gene clusters. I will continue to develop the MGC page by sections over the coming months, but will not likely contribute to the others. I just wanted an appropriate page for my field to be explained. If the pages were eventually merged (probably not the best solution) there would have to be an introductory framework about genome structure that encompasses both. Slotjc (talk) 20:06, 15 March 2018 (UTC)
 * The article was not created to be about the Hox gene cluster, but about gene clusters (sets of homologous genes, originating by duplication) in general. Wikipedia already has an article Hox gene. When I first came across "gene clusters" in 1974, the canonical example was myoglobin genes in mammals; I don't think Hox genes had been discovered. (I agree that the article is still quite a mess. When the two students had finished with it, I reversed much of what they'd done, and would have liked to go further but felt it would have been rude.) I wouldn't object to a name change – I don't know what this subject is usually called now. Maproom (talk) 20:54, 15 March 2018 (UTC)
 * This is definitely field-specific. Most bacteriologists or fungal biologists would think of sets of unrelated co-located genes contributing to a common function. This is really what the Fisher model and the Coregulation models were all about. Easy to explain how paralogs came to be clustered, but not how functionally associated genes did. I'm not sure when the GAL or Quinic acid clusters were first called clusters (I turned 1 in 1974), so I don't know which field has precedence. But the two are still conceptually related, even though evolutionarily and functionally distinct. I say we continue to improve both pages as they are now divided, and consider a merger under an introduction that details the different connotations at some point in the future.

Slotjc (talk) 20:11, 16 March 2018 (UTC)