User talk:O18/Archives 2

Bayesian stuff
I really think you're making way too big a deal out of something that is really not a very big deal. If you make a statement like "There is a 95% probability that between 36% and 44% of the voters want to vote for the party in question." then obviously it depends on the assumptions you've made in your model. But this is the case with every scientific statement about anything. There is nothing in Bayesian statistics that's any more subtle than what applies to any scientific discipline. The point I'm trying to make in the article on confidence intervals is a very simple one: Bayesian statistics provides you a model for making probabilistic statements about uncertainty due to lack of knowledge, while frequentist statistics does not. This is something that's important for a non-expert reader to know. You seem to think it's better not to state this at all than to make a point that isn't completely, utterly technically correct. I'm actually rather surprised by this, as in general you seem to share my viewpoint that Wikipedia articles need to be addressed to the non-expert reader rather than to the expert reader -- an excessive focus on technical issues is exactly what makes Wikipedia articles overly complicated and unreadable. Rather than constantly reverting my changes because you find some small fault in them, please try to be constructive and figure out how better to express the basic issue! If you don't like the way I've phrased the reason why people might use frequentist statistics instead of Bayesian statistics, then help me find a better way of phrasing it. Is it enough to say that the use of Bayesian inference "requires additional, subjective assumptions that may limit the circumstances in which the conclusions can be applied"? 
The point is to try and educate the lay reader on what the major issues and alternatives are, not to provide a cookbook explaining exactly how and when the alternatives should be applied; honestly I think it's plenty enough simply to note that there are additional complexities or subtleties in Bayesian statistics that lead some to favor frequentist statistics despite its limitations, without needing to state what those complexities are. Benwing (talk) 01:13, 20 October 2010 (UTC)
 * The fact that the interpretation of confidence intervals is not what people think is not a "precise, technical" issue, it's a fundamental fact of confidence intervals. This is not at all the same as the fact that Bayesian results are inapplicable if your model assumptions are false, which is a basic fact of all scientific results.  I think it's important to address the important points in an article, which includes a discussion of alternatives.  Nonetheless, I'm going to leave this article alone for a couple of days and revisit it later.  When I do, rather than start an RFC, I'll follow WP:BRD; I believe this to be more effective, as I've found that there is little incentive to address concerns unless positive action (e.g. revert) is required to do so. Benwing (talk) 02:37, 21 October 2010 (UTC)
 * OK, I get the feeling I'm not quite understanding what you're saying. When you say "I'm saying that when you present Pr(A|B,C) you had better make sure that the B and C are in the statement", can you elaborate on what you mean by B and C?  Do you mean the assumptions you've made about your prior distribution, and other assumptions that go into your model?  Or is there something else you're referring to?  If it's the former, would you be satisfied by something like "In the case of the voting poll described above, assuming that voters have made up their minds, the uncertainty about the true percentage of votes for the party in question is due to the fact that the choices of voters who have not been polled are unknown, rather than due to any actually random behavior on the part of the voters.  The paradigm of frequentist statistics does not allow for probabilistic reasoning about uncertainty of this sort, due to lack of knowledge rather than objectively random events.  Bayesian statistics does allow for such reasoning; however, it requires significant additional assumptions about the unknown quantities that can be difficult to justify objectively, and without which the results are meaningless." This makes very clear the respective limitations of the different approaches.  I don't explicitly mention the model assumptions other than those related to the prior distribution, but these are going to apply to both approaches and in fact to any scientific model. Now in point of fact, the frequentist approach does make assumptions about the prior distribution, specifically that it's uniform; but that is a separate issue. (Part of the reason I'm skeptical about frequentist statistics is that in my line of work, which is natural language processing, assuming a uniform prior is very often drastically wrong and produces completely nonsensical results.) Benwing (talk) 05:01, 21 October 2010 (UTC)
 * Thanks for continuing to engage me and try to find consensus; I really appreciate that. I'm aware that I'm not always the most diplomatic of people, esp. when I get frustrated.
 * The reason I want to say something to the effect of what I wrote just above is that I think the issue of what confidence intervals actually mean is a pretty basic fact that needs to be discussed in a way that a non-expert will understand. I do see that there are two sections about Bayesian alternatives, but the more useful section (the first section under "Alternatives", before "Philosophical issues") is way down near the bottom.  Here's a suggestion for what I'd like to do:
 * In "Introduction", put back some of the text I wrote that discusses the trickiness in interpreting the results, without the Bayesian reference.
 * In the lead, take out the sentence "Confidence intervals are used in frequentist statistics; the equivalent in Bayesian statistics is the credible interval." and instead, put an extra paragraph at the end of the lead that says:
 * Confidence intervals are used in frequentist statistics. The equivalent in Bayesian statistics is the credible interval.  Bayesian statistics does provide a method for computing the probability that the true parameter value lies in a given interval, but comes with its own issues and limitations; see below for more information.
 * Also, in the "Alternatives" section, is the following statement about confidence intervals actually true?
 * "Users of Bayesian methods, if they produced an interval estimate, would in contrast to confidence intervals, want to say "My degree of belief that the parameter is in fact in this interval is 90%,"[9] while users of prediction intervals would instead say "I predict that the next sample will fall in this interval 90% of the time."[citation needed]"
 * If you are basically OK with this, then I might go ahead and make these changes and let Melcombe revert if he wants (WP:BRD style). Based on my past dealings with Melcombe, it appears he doesn't like me very much and would rather simply obstruct me than engage in a genuine dialog, as you've been doing. Benwing (talk) 03:24, 22 October 2010 (UTC)
 * I forgot to answer your question about prior distributions. AFAIK maximum likelihood can be described as frequentist, and maximum likelihood is exactly equivalent to maximum a posteriori with a (possibly improper) uniform prior distribution.  In many NLP applications you need a highly non-uniform prior in order to get decent results.  For example, in a topic model it is not unusual to use prior distributions that are symmetric Dirichlet distributions with a concentration parameter of 0.001 or even 0.0001; this is because a "topic" is technically a distribution over however many words are in your vocabulary, and if you have a vocabulary of 1,000,000 words (not uncommon if you construct your vocabulary based on a large text corpus), you absolutely do not want your topic smeared more or less equally over all 1,000,000 of these!  Even worse, imagine that you have a corpus consisting of parse trees over 40,000 sentences, and you want to learn a tree substitution grammar from this corpus, which is kind of like a context-free grammar (CFG) but where instead of just having rules that expand a single non-terminal parent into its children, your rules can be arbitrarily sized, anywhere from a simple CFG rule to a rule that expands the entire sentence at once.  If you apply expectation maximization (EM) to this problem without some prior distribution that favors small rules, your EM algorithm will come back saying, "OK, I learned 40,000 rules, each of which expands an entire sentence, and by the way I did a really good job, since p(data|rules) = 1". This is why most computational linguists are committed, die-hard Bayesians. Benwing (talk) 08:11, 22 October 2010 (UTC)
 * At the same time I totally understand how not everyone is so enamored of having to choose a prior distribution. But what about using non-informative priors?  What happens e.g. in the case of the vote poll example, if you do a Bayesian analysis and use a non-informative prior?  Presumably you still get out a credible interval. Benwing (talk) 08:11, 22 October 2010 (UTC)
 * OK, you made a lot of interesting points but didn't actually respond to anything I mentioned regarding the page. As I said, I'm trying to establish consensus with one person at a time, which I think will improve the signal-to-noise ratio.  BTW as for your air-traffic example, the short answer is that specific info like "altitude from the ground" doesn't go into the prior at all; rather, it goes into the features.  Priors in Bayesian modeling in NLP are only used to express general biases like preferring sparse solutions or perhaps biasing towards high-info or low-info words (you might have a mixture model with separate components for high-info and low-info words).  The real reason why voice recognition currently isn't reliable enough is that it's a really really tough problem that requires much more sophisticated models than we currently have.  In fact most production-level systems that suck so badly aren't Bayesian at all, but are just doing basic EM to learn.  Bayesian models using MCMC and Dirichlet processes and such tend to be more accurate but they're very new, still at the forefront of research. Benwing (talk) 22:36, 22 October 2010 (UTC)
 * If our conflict is "irreducible" then it basically means you're going to obstruct anything I suggest so there's no point in talking on the talk page. In such cases I would honestly rather just make the changes I want and force you to revert; at least then there is a possibility of finding an actual consensus. Benwing (talk) 07:21, 23 October 2010 (UTC)
 * Wikipedia is not a democracy so 2 against 1 is not a valid reason for doing or not doing something. Nonetheless I'm going to leave this alone for now as I'm busy, but when I have time I'll go ahead and make my edits; perhaps you will be surprised after all. Benwing (talk) 00:41, 24 October 2010 (UTC)
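For readers following the thread, here is a rough sketch of two technical points raised above: the question of what a non-informative prior gives in the vote-poll example, and the remark that maximum likelihood coincides with maximum a posteriori under a uniform prior. The poll counts (576 respondents, 230 in favour) are invented for illustration and do not come from the discussion; the binomial likelihood with a uniform Beta(1, 1) prior is likewise an assumption, chosen because it is the simplest conjugate model, not something any participant specified.

```python
import math

def beta_log_pdf(x, a, b):
    """Log density of a Beta(a, b) distribution at x, for 0 < x < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x)

def credible_interval(a, b, mass=0.95, grid=20000):
    """Equal-tailed credible interval of Beta(a, b) via midpoint-rule integration."""
    step = 1.0 / grid
    xs = [(i + 0.5) * step for i in range(grid)]
    weights = [math.exp(beta_log_pdf(x, a, b)) * step for x in xs]
    total = sum(weights)  # renormalise so the discretised mass sums to 1
    tail = (1.0 - mass) / 2.0
    lower = upper = None
    cumulative = 0.0
    for x, w in zip(xs, weights):
        cumulative += w / total
        if lower is None and cumulative >= tail:
            lower = x
        if cumulative >= 1.0 - tail:
            upper = x
            break
    return lower, upper

# Hypothetical poll data (assumption, not from the discussion).
n, k = 576, 230

# Uniform Beta(1, 1) prior + binomial likelihood -> Beta(k + 1, n - k + 1) posterior.
a, b = 1 + k, 1 + (n - k)

mle = k / n                           # maximum likelihood estimate
map_estimate = (a - 1) / (a + b - 2)  # posterior mode, valid when a > 1 and b > 1
low, high = credible_interval(a, b)

print(f"MLE = {mle:.4f}, MAP under uniform prior = {map_estimate:.4f}")
print(f"95% credible interval: ({low:.3f}, {high:.3f})")
```

Under these assumptions the posterior mode equals k/n, so the two point estimates agree, and the analysis does still yield a credible interval. A different prior, such as a sparse Dirichlet of the kind mentioned for topic models, would pull the mode away from the maximum likelihood estimate.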

United States public debt
This is where I got the US Federal debt as % of GDP numbers - http://www.usgovernmentspending.com/federal_debt_chart.html; and the Tax brackets are here - http://www.ntu.org/tax-basics/history-of-federal-individual-1.html. I don't have the time, was just trying to help out... Geek2003 (talk) 12:29, 13 October 2010 (UTC)

I've drafted a Wikipedia article: User:Csdidier/Public_Debt_Vocabulary_Shift. I was wondering if you would be willing to give it a quick look, and offer some criticism. Csdidier (talk) 16:06, 26 November 2010 (UTC)

Debt
Sorry, but your revert on the Obama increase of debt in 2010 does not match the figures shown. All the figures (percentage point increases) are calculated from the adjusted debt (Dx) numbers by dividing the current year by the previous year and subtracting 1 (i.e. 100%), e.g. % = (D_n / D_(n-1)) - 1. There is effectively no way the Obama increase could be around 6%; it should be above 30% (exactly 31.6%). The definition of percentage point does not change this; just read through the article on percentage point increases, which you have added to the heading. -- Preceding unsigned comment added by 145.62.32.131 (talk) 16:46, 15 July 2011 (UTC)
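
The distinction at issue in the comment above can be sketched as follows. The two input values are made up for illustration (they are not actual debt figures and are not taken from the article), and the function names are hypothetical. The first function implements the formula quoted above, (D_n / D_(n-1)) - 1; the second computes a percentage-point difference, which is only meaningful when the inputs are themselves percentages, such as debt as a share of GDP.

```python
def relative_increase(previous, current):
    """Relative change between two values: (D_n / D_(n-1)) - 1."""
    return current / previous - 1.0

def percentage_point_change(previous_pct, current_pct):
    """Difference between two percentages, expressed in percentage points."""
    return current_pct - previous_pct

# Illustrative values only (assumption): a ratio that moves from 50.0% to 65.8%.
previous, current = 50.0, 65.8

rel = relative_increase(previous, current)
pp = percentage_point_change(previous, current)

print(f"Relative increase: {rel:.1%}")
print(f"Percentage-point change: {pp:.1f} points")
```

The same pair of numbers gives very different answers under the two definitions, which is why a column heading needs to say which one it reports.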

Your old talk page
So just to clarify, you wanted to change your name for privacy purposes? -- Atama 頭 19:59, 15 August 2011 (UTC)
 * Yes. 018 (talk) 20:07, 15 August 2011 (UTC)
 * I will delete it then, because that actually is one of the exceptions where we will delete a user talk page. I'll do so now, thank you. -- Atama 頭 20:19, 15 August 2011 (UTC)
 * Okay, thanks for helping me. 018 (talk) 20:25, 15 August 2011 (UTC)