User:Aly-khan madhavji/Computer Science: Reinforcement Learning from a student

Reinforcement Learning
Reinforcement learning is a form of artificial intelligence in which a machine learns, by trial and error, which option yields the greatest long-term benefit, and so improves its performance over time. From its current position, the machine works out which option is likely to be most beneficial and pursues it, adjusting as it sees the results. This type of learning suits interactive problems, where new scenarios and possibilities keep arising. Such problems most commonly appear in games and other computer programs.

Steps Taken in the Process of Reinforcement Learning (Algorithms)
Reinforcement learning can be represented by an equation whose variables stand for three aspects: the situation, the action taken, and the result. The available actions depend on the situation, which is modeled as A(s): actions are a function of the situation. The situation (s) is the independent variable, whereas the action (A) is the dependent variable, since the action taken depends on the situation presented. Actions are chosen in order to produce results as close to the desired outcome as possible. Results are represented by the variable R: the action (A) taken in the situation (s) produces the result (R).
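The relationship between situation, action, and result can be sketched in a few lines of Python. This is only an illustration under assumed details: the toy world (positions 0 to 4 on a line, with position 4 as the goal), the names available_actions and take_action, and the result values 1.0 and 0.0 are all hypothetical and do not come from any particular library.

```python
# Hypothetical toy world: the machine stands on a line of positions 0..4.
# Reaching position 4 (the goal) is the desired outcome.
GOAL = 4

def available_actions(situation):
    """A(s): the actions that are possible in situation s."""
    actions = []
    if situation > 0:
        actions.append("left")
    if situation < GOAL:
        actions.append("right")
    return actions

def take_action(situation, action):
    """Take action A in situation s; return the new situation and the result R."""
    if action not in available_actions(situation):
        raise ValueError(f"action {action!r} is not possible in situation {situation}")
    new_situation = situation + 1 if action == "right" else situation - 1
    # Assumed result scheme: R is 1.0 when the goal is reached, otherwise 0.0.
    result = 1.0 if new_situation == GOAL else 0.0
    return new_situation, result
```

For example, available_actions(0) offers only "right", because there is nothing to the left of position 0, and take_action(3, "right") reaches the goal and produces a result of 1.0.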

The outcome is only partially controlled by the machine, because it also depends on the situation (the environment). Actions are chosen for their predicted effect, but the final outcome is uncertain and cannot be perfectly predicted beforehand. By observing how the actual outcome deviates from the prediction, the machine can learn and adapt its actions, trying new things to get results as close to the desired outcome as possible. In some cases the same action does not always produce the same result, so to determine its effect accurately the action may have to be performed a large number of times. This is a stochastic method: the same action is repeated many times, and its effect is judged from the outcomes of all those trials.
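The idea of repeating actions many times to learn their uncertain results can be sketched as follows. The sketch uses a simple strategy commonly called epsilon-greedy (mostly pick the action that currently looks best, occasionally try a random one), which is an assumption chosen for illustration and not something the text above prescribes. The function name estimate_action_values, the action names "a" and "b", and their success chances of 0.3 and 0.7 are likewise hypothetical.

```python
import random

def estimate_action_values(success_chance, trials=5000, explore_rate=0.1, seed=0):
    """Estimate each action's average result by repeated trial and error.

    success_chance maps each action to the (hidden) probability that it
    produces a result of 1.0 instead of 0.0.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    actions = list(success_chance)
    totals = {a: 0.0 for a in actions}
    counts = {a: 0 for a in actions}
    estimates = {a: 0.0 for a in actions}

    for _ in range(trials):
        if rng.random() < explore_rate:
            # Explore: try a random action to learn something new.
            action = rng.choice(actions)
        else:
            # Exploit: take the action with the best estimate so far.
            action = max(actions, key=lambda a: estimates[a])

        # The result is uncertain: the same action can succeed or fail.
        result = 1.0 if rng.random() < success_chance[action] else 0.0

        # Update the running average of results for this action.
        counts[action] += 1
        totals[action] += result
        estimates[action] = totals[action] / counts[action]

    return estimates, counts

estimates, counts = estimate_action_values({"a": 0.3, "b": 0.7})
best_action = max(estimates, key=estimates.get)
```

A single trial says little about an action, but after thousands of trials the running averages should settle near the hidden success chances, so the machine can tell that "b" is the better choice.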

The Chess Phenomenon
Garry Kasparov is known as one of the world's greatest chess players. Throughout the 1990s he played against the world's most advanced chess computers. Mr. Kasparov consistently won his matches against IBM's chess computers until May 1997. In a controversial six-game match that month, the World Chess Champion lost two games and won one, with three draws, against the IBM machine named Deep Blue. This was a breakthrough victory for IBM and, more importantly, for the development of artificial intelligence, although Deep Blue itself relied mainly on very fast search and an evaluation function tuned by experts rather than on reinforcement learning. "When IBM's chess-playing computer Deep Blue defeated Gary Kasparov in 1997, a pall fell over the chess world. It was clear that computers had come to dominate the game of chess." - Robert L. Smith (2003). Kasparov quickly accused IBM of cheating, as its team had upgraded Deep Blue between games. The supercomputer was dismantled soon after the match and has earned its place in history as the computer that beat the world chess champion. The match was a leap forward for the technology sector, and developing computers and artificial intelligence has since become a primary focus for the industry.

Research
Companies and research groups constantly strive to improve their systems so that they can perform more difficult and complicated tasks. Researchers at Rutgers University have committed themselves to creating intelligent agents by developing learning algorithms that work towards maximizing return. They seek to improve reinforcement learning by paying attention to everyday situations. Many competitive artificial intelligence companies follow this inspirational quote: "Find a bug in a program, and fix it, and the program will work today. Show the program how to find and fix a bug, and the program will work forever." - Oliver G. Selfridge. This is truly the motto for reinforcement learning, as machines are being programmed to find and solve problems independently and effectively. Artificial intelligence is a dynamic, constantly improving field, and some believe that in the future humans will become dependent on machines and at the mercy of reinforcement learning.