User:CaveatLector2022/Tests

Financial markets
The AI technology company c3.ai saw a 28% increase in its share price after announcing the integration of ChatGPT into its toolkit. The share price of BuzzFeed, a digital media company unrelated to AI, increased 120% after the company announced that it would adopt OpenAI technology for content creation. Reuters found that share prices of AI-related companies BigBear.ai and SoundHound AI increased by 21% and 40%, respectively, even though they had no direct connection to ChatGPT. It attributed this surge to ChatGPT's role in turning AI into Wall Street's buzzword. Academic research published in Finance Research Letters found that the 'ChatGPT effect' prompted retail investors to drive up prices of AI-related cryptocurrency assets despite a bear market in the broader cryptocurrency market and diminished institutional investor interest. This confirms anecdotal findings by Bloomberg that, in response to ChatGPT's launch, cryptocurrency investors showed a preference for AI-related crypto assets. An experiment by finder.com found that ChatGPT could outperform popular fund managers by picking stocks based on criteria such as growth history and debt levels: a hypothetical account of 38 stocks gained 4.9%, while 10 benchmarked investment funds recorded an average loss of 0.8%.

Conversely, executives and investment managers at Wall Street quant funds (including those that have used machine learning for decades) have noted that ChatGPT regularly makes obvious errors that would be financially costly to investors. They point out that even AI systems that employ reinforcement learning or self-learning have had only limited success in predicting market trends, owing to the inherently noisy quality of market data and financial signals. In November 2023, research conducted by Patronus AI, an artificial intelligence startup company, compared the performance of GPT-4, GPT-4-Turbo, Claude 2, and LLaMA-2 on two versions of a 150-question test about information in financial statements (e.g. Form 10-K, Form 10-Q, Form 8-K, earnings reports, earnings call transcripts) submitted by public companies to the U.S. Securities and Exchange Commission. One version of the test required the generative AI models to use a retrieval system to find the specific SEC filing needed to answer each question; the other gave the models the specific SEC filing directly (i.e. in a long context window). On the retrieval system version, GPT-4-Turbo and LLaMA-2 both failed to produce correct answers to 81% of the questions, while on the long context window version, GPT-4-Turbo and Claude 2 failed to produce correct answers to 21% and 24% of the questions, respectively.

Medicine
In the field of health care, possible uses and concerns are under scrutiny by professional associations and practitioners. Two early papers indicated that ChatGPT could pass the United States Medical Licensing Examination (USMLE). MedPage Today noted in January 2023 that "researchers have published several papers now touting these AI programs as useful tools in medical education, research, and even clinical decision making."

Two separate papers published in February 2023 again evaluated ChatGPT's proficiency in medicine using the USMLE. The findings appeared in JMIR Medical Education (see Journal of Medical Internet Research) and PLOS Digital Health. The authors of the PLOS Digital Health paper stated that the results "suggest that large language models may have the potential to assist with medical education, and potentially, clinical decision-making." In JMIR Medical Education, the authors of the other paper concluded that "ChatGPT performs at a level expected of a third-year medical student on the assessment of the primary competency of medical knowledge." They suggested that it could be used as an "interactive learning environment for students". The AI itself, prompted by the researchers, concluded that "this study suggests that ChatGPT has the potential to be used as a virtual medical tutor, but more research is needed to further assess its performance and usability in this context." The later-released ChatGPT version based on GPT-4 significantly outperformed the version based on GPT-3.5. Researchers at Stanford University and the University of California, Berkeley found that the performance of GPT-3.5 and GPT-4 on the USMLE declined from March 2023 to June 2023.

A March 2023 paper tested ChatGPT's application in clinical toxicology. The authors found that the AI "fared well" in answering a "very straightforward [clinical case example], unlikely to be missed by any practitioner in the field". They added: "As ChatGPT becomes further developed and specifically adapted for medicine, it could one day be useful in less common clinical cases (i.e, cases that experts sometimes miss). Rather than AI replacing humans (clinicians), we see it as 'clinicians using AI' replacing 'clinicians who do not use AI' in the coming years."

An April 2023 study in Radiology tested the AI's ability to answer queries about breast cancer screening. The authors found that it answered appropriately "about 88 percent of the time"; in one case, however, it gave advice that had become outdated about a year earlier. Its answers also lacked comprehensiveness. A study published in JAMA Internal Medicine that same month found that ChatGPT often outperformed human doctors at answering patient questions. The comparison used questions and answers found at /r/AskDocs, a forum on Reddit where moderators validate the medical credentials of professionals; the study acknowledges the source as a limitation. The study authors suggested that the tool could be integrated with medical systems to help doctors draft responses to patient questions.

Professionals have emphasized ChatGPT's limitations in providing medical assistance. In correspondence to The Lancet Infectious Diseases, three antimicrobial experts wrote that "the largest barriers to the implementation of ChatGPT in clinical practice are deficits in situational awareness, inference, and consistency. These shortcomings could endanger patient safety." Physician's Weekly, though also discussing the potential use of ChatGPT in medical contexts (e.g. "as a digital assistant to physicians by performing various administrative functions like gathering patient record information or categorizing patient data by family history, symptoms, lab results, possible allergies, et cetera"), warned that the AI might sometimes provide fabricated or biased information. One radiologist warned: "We've seen in our experience that ChatGPT sometimes makes up fake journal articles or health consortiums to support its claims." As reported in one Mayo Clinic Proceedings: Digital Health paper, ChatGPT may do this for as many as 69% of its cited medical references. The researchers emphasized that while many of its references were fabricated, those that were appeared "deceptively real". However, as Dr. Stephen Hughes noted for The Conversation, ChatGPT is capable of learning to correct its past mistakes. He also noted the AI's "prudishness" regarding sexual health topics.

Contrary to previous findings, ChatGPT's responses to anesthesia-related questions were more accurate, succinct, and descriptive than Bard's. Bard exhibited a 30.3% error rate in its responses, compared with 0% for ChatGPT. At a conference of the American Society of Health-System Pharmacists in December 2023, researchers at Long Island University (LIU) presented a study that examined ChatGPT's responses to 45 questions frequently asked of the LIU College of Pharmacy's drug information service during a 16-month period from 2022 to 2023, comparing them with researched responses provided by professional pharmacists. For 29 of the 39 questions for which there was sufficient medical literature for a data-driven response, ChatGPT failed to provide a direct answer or provided a wrong or incomplete answer (and in some cases, if acted upon, the answer would endanger the patient's health). The researchers had asked ChatGPT to provide medical research citations for all its answers, but it did so for only eight, and all eight included at least one fabricated (fake) citation.

A January 2024 study conducted by researchers at Cohen Children's Medical Center found that GPT-4 had an accuracy rate of 17% when diagnosing pediatric medical cases.

Law
In January 2023, Massachusetts State Senator Barry Finegold and State Representative Josh S. Cutler proposed a bill partially written by ChatGPT, "An Act drafted with the help of ChatGPT to regulate generative artificial intelligence models like ChatGPT", which would require companies to disclose their algorithms and data collection practices to the office of the State Attorney General, arrange regular risk assessments, and contribute to the prevention of plagiarism. The bill was officially presented during a hearing on July 13.

On April 11, 2023, a judge of a sessions court in Pakistan used ChatGPT to help decide on bail for a 13-year-old accused in a case. The court quoted its use of ChatGPT in its verdict:

The AI language model replied:

The judge asked ChatGPT other questions about the case and formulated his final decision in light of its answers.

In Mata v. Avianca, Inc., 22-cv-1461 (PKC), a personal injury lawsuit against Avianca Airlines filed in the U.S. District Court for the Southern District of New York in May 2023 (with Senior Judge P. Kevin Castel presiding), the plaintiff's attorneys reportedly used ChatGPT to generate a legal motion. The motion ChatGPT produced cited numerous fictitious legal cases involving fictitious airlines, with fabricated quotations and internal citations. Castel noted numerous inconsistencies in the opinion summaries, and called the legal analysis in one of the cases "gibberish". The plaintiff's attorneys faced potential judicial sanction and disbarment for filing the motion and presenting the fictitious legal decisions ChatGPT generated as authentic. The case was dismissed and the attorneys were fined $5,000.

In October 2023, the council of Porto Alegre, Brazil, unanimously approved a local ordinance proposed by councilman Ramiro Rosário that would exempt residents from needing to pay for the replacement of stolen water consumption meters; the bill went into effect on November 23. On November 29, Rosário revealed that the bill had been entirely written by ChatGPT, and that he had presented it to the rest of the council without making any changes or disclosing the chatbot's involvement. The city's council president, Hamilton Sossmeier, initially criticized Rosário's initiative, saying it could represent "a dangerous precedent", but later said he "changed his mind": "unfortunately or fortunately, this is going to be a trend."