Wikipedia:Reference desk/Archives/Mathematics/2019 March 25

= March 25 =

Optimizing an unknown classifier
Suppose you have an unknown function that classifies the elements of some search space $$A$$ (for example, strings in natural languages); i.e., you have an unknown classifier $$f:A \rightarrow \{0,1\}$$.

The goal is to find some $$a$$ such that $$f(a)=1$$.

If $$f$$ were known, then you could use standard search techniques on $$A$$ to find some $$a \in A$$ such that $$f(a)=1$$.

But $$f$$ is unknown, of course. What, then, is to be done?

Suppose you can generate some $$a_1,...,a_n \in A$$ and test whether $$f(a_i) = 0$$ or $$f(a_i) = 1$$. Then you might think you could use some standard supervised machine learning algorithm to estimate $$\hat f$$ and then optimize the estimate, i.e., use search to find $$a$$ such that $$\hat f(a) = 1$$.

The problem is that, if $$f(a_i) = 0$$ for every $$i$$, then any supervised learning algorithm is going to return the zero function as the estimate; $$\forall a \in A: \hat f(a) = 0$$, because that is the simplest function consistent with the data points. Such an estimate would say there is no $$a$$ such that $$f(a)=1$$, which we assume is wrong.

I suspect a different search / machine learning paradigm is needed: perhaps one that exploits the structure of $$A$$, uses probability, or draws on other data.

For example, if you have some metric on $$A$$, and you conclude that $$\forall i: f(a_i)=0$$, then the next $$a_i$$ you try is one which is far away from the ones you have already tried, because that is most likely to have $$f(a_i)=1$$ ($$a_i$$s which are similar are likely to be classified the same way as 0).
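As an illustrative sketch of this "try something far from everything tried so far" idea, here is greedy farthest-point selection in Python. The metric (a toy Hamming-style distance on strings) and the candidate pool are assumptions for the example, not part of the question:

```python
# Illustrative sketch: greedy farthest-point exploration of a search space.
# The metric and the candidate pool are assumptions for this example.

def hamming(a: str, b: str) -> int:
    """Toy metric on strings: pad to equal length, count differing positions."""
    n = max(len(a), len(b))
    a, b = a.ljust(n), b.ljust(n)
    return sum(c1 != c2 for c1, c2 in zip(a, b))

def next_probe(tried: list[str], candidates: list[str]) -> str:
    """Pick the candidate whose distance to the nearest tried point is largest."""
    return max(candidates, key=lambda c: min(hamming(c, t) for t in tried))

tried = ["abc", "abd"]          # points already tested, all with f = 0
pool = ["abe", "xyz", "abz"]    # candidates for the next query
print(next_probe(tried, pool))  # prints "xyz", the farthest from everything tried
```

This is essentially farthest-point sampling; it maximizes coverage of the space under the metric, but (as the next paragraph notes) pure distance maximization ignores any prior about where $$f(a)=1$$ is likely.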

But this doesn't seem quite right either. Suppose $$A$$ is a set of strings, and suppose you have cause to believe that $$f(a) = 1$$ is more probable if $$a$$ is a valid sentence in English. Then you wouldn't want to go too far away from the sentences you have already tried, because if $$a$$ is a sentence in Chinese, it's then less likely to have the desired property (but you can't rule out the possibility).

The point is, I don't have a principled foundation for solving this problem.

By the way, if there is a better medium for asking questions like this, let me know (I'm not sure if there's an appropriate Stack Exchange). --49.183.160.223 (talk) 13:19, 25 March 2019 (UTC)
 * The situation you're describing seems very general, perhaps too general for there to be a specific answer. You've got some kind of metric on A, but without knowing what it is or how it relates to f, it seems like it would be difficult to apply that information. Also, from what you're describing, it sounds like you're likely never to get an f(a) = 1 with random inputs a, so it seems unlikely that guided learning will be effective. The most you could hope for is that you'll train the AI toward f(a) = 0 as the estimate for f. But I don't claim to be an expert on AI and perhaps another forum would be better; there is a Stack Exchange for AI you might try. --RDBury (talk) 14:28, 27 March 2019 (UTC)

If the a's with f(a)=1 are very sparse, then you have a needle-in-a-haystack problem and there's not much you can do. If there were, cryptography wouldn't work. If you've got a mythical quantum computer, Grover's algorithm gives a quadratic speedup over brute force, but with a large enough search space even that won't help much. In other words you have to know something about f, or have finer-grained classification than binary, or whatever. 67.164.113.165 (talk) 17:34, 28 March 2019 (UTC)

How to calculate a percentage before and after it increases by a certain standard deviation
How do you calculate the percentage of something before and after it increases by a certain standard deviation? For instance, if 12% of Canadians want to join the U.S. and this figure increases by 1.4 standard deviations, what percentage of Canadians are now going to want to join the U.S.?

Also, please clearly show me all of the steps so that I could understand how to do this every single step of the way. Thank you. Futurist110 (talk) 22:16, 25 March 2019 (UTC)
 * Firstly (and assuming that we are following a Normal distribution), you need to find the z-score for p=0.12. Looking up a Normal distribution table or using an online calculator gives an answer of about z=-1.175. Then, you add 1.4 to that z-score which gets you z=0.225. And finally, to get your answer, you find the p-value for the new z-score with your table/calculator from earlier, which gives the final answer of about p=0.589, so ~58.9% of Canadians now want to join the U.S. Iffy★Chat -- 23:17, 25 March 2019 (UTC)
 * Thank you very much for your great explanation here, Iffy! Futurist110 (talk) 19:23, 31 March 2019 (UTC)
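The steps in Iffy's answer can be checked with Python's standard-library `statistics.NormalDist` (a sketch assuming the same Normal-distribution model as the answer):

```python
# Reproducing the steps above: look up the z-score for p = 0.12, shift it
# by 1.4 standard deviations, and convert back to a proportion.
from statistics import NormalDist

nd = NormalDist()       # standard Normal: mean 0, standard deviation 1
z = nd.inv_cdf(0.12)    # z-score for p = 0.12; about -1.175
z_new = z + 1.4         # increase by 1.4 standard deviations
p_new = nd.cdf(z_new)   # p-value for the new z-score
print(round(z, 3), round(p_new, 3))  # prints: -1.175 0.589
```

`inv_cdf` plays the role of the Normal-distribution table lookup, and `cdf` converts the shifted z-score back to a percentage (~58.9%).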