Talk:AI safety

From Wikipedia, the free encyclopedia

Single-purpose account

SPA User:Joshuaclymer wrote this page to have the same section headings as [1]. Could be astroturfing. 5.151.106.5 (talk) 17:15, 14 July 2023 (UTC)[reply]

COI tag (July 2023)

A little bit of search indicates probable cause for significant COI through potential association of article creator with the Center for AI Safety. Graywalls (talk) 02:02, 27 July 2023 (UTC)[reply]

I wrote the initial version of the article after noticing that, surprisingly, there was no AI safety article on Wikipedia, despite it being a globally relevant topic today. I created a Wikipedia account to write this article and used my full name for transparency. It seems there are some key misunderstandings worth clarifying.
I am an independent AI risk researcher who was not paid by anyone to write the article. My funding is from a non-AI fellowship. I received funding from the Open Philanthropy Century Fellowship [2] to have autonomy to work on whatever I choose for two years, starting last year. During this time, I have volunteered and collaborated with AI risk organizations and studied AI safety topics.
Without doing AI risk research or gaining experience from a variety of AI risk perspectives, I don't think the article could have been written informatively or clearly. Before posting the initial version of the article, I collaborated with and received feedback from other individuals, such as User:SoerenMind, who helped author AI alignment. For the section headings of the purely technical AI risk research areas, I used a standard three-way decomposition (alignment, monitoring, robustness). Google DeepMind used "specification, assurance, robustness" [3]; specification is another word for "alignment," which is much more common. Likewise, Anthropic uses "steerable, interpretable, and robust" [4].
Feel free to change “alignment” to “specification” if you’d prefer to imitate the DeepMind breakdown, for example. Joshuaclymer (talk) 17:59, 3 August 2023 (UTC)[reply]
The article's table of contents seems inspired by the work of Dan Hendrycks, a researcher who has published some articles presenting broad overviews of AI safety, and who cofounded and leads the Center for AI Safety (founded in 2022). The content of the paragraphs, on the other hand, seems drawn from diverse academic sources, and the article does not promote the Center for AI Safety. The article (and its [original version] written by Joshuaclymer) looks pretty neutral to me.
What do you think, @Graywalls: ? Alenoach (talk) 21:35, 26 August 2023 (UTC)[reply]
Nice catch @Graywalls:. There does appear to be an overlap between @Joshuaclymer:'s November 2022 editing and their role at the Center for AI Safety. I thank them for their general transparency.
I agree that Wikipedia expects such formal connections to be more clearly disclosed upfront. In this case, the article has received substantial attention from a wide variety of other editors over approximately the past year. It may be appropriate to move relevant maintenance templates to the talk page (see Template:Connected contributor) in order to avoid clutter and misleading readers. WeyerStudentOfAgrippa (talk) 14:26, 10 November 2023 (UTC)[reply]

Wiki Education assignment: Research Process and Methodology - SU23 - Sect 200 - Thu

This article was the subject of a Wiki Education Foundation-supported course assignment, between 24 May 2023 and 10 August 2023. Further details are available on the course page. Student editor(s): York1210 (article contribs).

— Assignment last updated by York1210 (talk) 17:49, 29 July 2023 (UTC)[reply]

AI and workplace safety

I propose moving the recently added subsection "Hazards to workers" to another article, perhaps workplace wellness or occupational safety and health. Although this subsection is well written and is relevant to both AI and safety, it is not really a major focus of the field of AI safety, as highlighted by the fact that the references don't even mention "AI safety," as far as I can access them. AI safety is more of a theoretical research field, not really about concrete implementations of safety practices in everyday environments (see the section "Research focus"), and this subsection could confuse readers as to what AI safety and the rest of the article are about. Alenoach (talk) 23:20, 3 May 2024 (UTC)[reply]

@Alenoach: That's actually very interesting. Is there another Wikipedia article focusing on practical implementations of safety in AI? I'm not sure there's no place in this article for the practical side of AI safety, as opposed to the purely theoretical side, as long as we can find the proper balance. Research and practice should become increasingly linked as more AI applications are deployed, and people reading an AI safety article are going to want to learn how AI affects their safety in a real-world way. John P. Sadowski (NIOSH) (talk) 21:46, 4 May 2024 (UTC)[reply]
You make a good point. Perhaps we could have a section "Workplace safety" that would be an excerpt from Occupational_safety_and_health#Artificial_intelligence? I would rather put it at the end of the article, though, and we could add a sentence or two about this in the section "Motivations". Do you think that would be good? Alenoach (talk) 06:39, 5 May 2024 (UTC)[reply]