Talk:List of datasets for machine-learning research

Breaking into smaller lists
At what point do the lists become large enough to break off the lists into separate articles? — Preceding unsigned comment added by SeanBob123 (talk • contribs) 06:35, 23 January 2017 (UTC)
 * I don't think it's needed just yet - if the page gets unwieldy then we can break it out. DATAKEEPER    ✉    16:26, 21 September 2017 (UTC)

Comparison of datasets in machine learning being merged in
✅ Absolutely agree. DATAKEEPER   ✉    19:20, 26 February 2016 (UTC)


 * See also Talk:Comparison of datasets in machine learning for discussion. DATAKEEPER    ✉    20:07, 7 March 2016 (UTC)

Comparison of facial image datasets being merged in
✅ I agree with this article being merged in. DATAKEEPER   ✉    19:26, 26 February 2016 (UTC)
 * No objection or comment, so I am performing the merge and trimming non-notable datasets. I am also trimming some primary sources and collapsing the schema to match this article's. DATAKEEPER    ✉    22:38, 7 March 2016 (UTC)

Use of external links
There was a discussion on the external links noticeboard regarding the use of external links on this page. It was decided that they should not be included for the table entries. DATAKEEPER   ✉    21:37, 4 March 2016 (UTC)

Updating again in summer 2017: I have noticed some external links creeping in. Please keep them off the list and use only internal links if the datasets are noteworthy enough to have their own page. DATAKEEPER   ✉    20:38, 28 August 2017 (UTC)

Adding licenses
Distinguishing between datasets with restricted versus unrestricted licenses, particularly since they generally do not change, would be an added benefit to researchers. Are there any objections to adding a column for licenses? — Preceding unsigned comment added by JShenk (talk • contribs) 09:58, 15 October 2018 (UTC) ‎
 * I was coming here for the same issue.
 * I was thinking that the place to manage this list ought to be Wikidata, so that this list can be machine readable. I support anyone adding a column for licenses here but probably Wikidata is the place to sort this.  Blue Rasberry   (talk)  20:13, 15 February 2021 (UTC)
 * Support Geysirhead (talk) 19:30, 5 May 2023 (UTC)

Splitting out the image datasets?
What about splitting out the image datasets, including mixed ones with lidar and radar? Geysirhead (talk) 19:31, 5 May 2023 (UTC)


 * @Geysirhead 94.109.191.93 (talk) 06:52, 2 June 2023 (UTC)
 * Was it a "yes"? Geysirhead (talk) 18:31, 18 June 2023 (UTC)