Wikipedia:Reference desk/Archives/Computing/2021 October 28

= October 28 =

How can I see only the latest year for each country in a spreadsheet?
Hi. Please see the country table here:
 * List of countries by intentional homicide rate

The source is here: Click the "bulk data download" link (name shows up when hovering over it) at the top right a couple times to download the file named:
 * https://dataunodc.un.org/content/homicide-rate-option-2
 * homicide_total_rate_and_count.xls

For this homicide rate spreadsheet the latest year varies by country. I am wondering how I can quickly get the spreadsheet down to showing only the data for the latest year.

I wrote this section:
 * Help:Table

But I have limited knowledge of how to use spreadsheets. I would like to know how to get the latest year in LibreOffice Calc since I have it installed, and it is freeware.

Once I have the spreadsheet pared down to the latest year by country, then it is easy to copy it to the Wikipedia article in order to update it. --Timeshifter (talk) 21:35, 28 October 2021 (UTC)


 * There's probably a bunch of ways to do this, but here's the steps I would take:


 * 1) Import the CSV into the spreadsheet. I assume Libre does this pretty much automatically; Excel certainly does.
 * 2) Define the data set as a table (in Excel, CTRL+T). This is optional, but will make later steps easier.
 * 3) Apply filters (in Excel CTRL+SHIFT+L).
 * 4) If you want rates, filter to show counts and delete those rows; if you want counts, filter to show rates and delete.
 * 5) You now have a clean list of data. Sort the year column, highest to lowest.
 * 6) Here's the tricky part and you may want to do it across multiple columns, but it's possible to do it in one:
 * 7) Do a VLOOKUP on the country to look up the year. Since VLOOKUP only returns the first value, it will return the last year.
 * 8) Perform a check to see if the year you VLOOKUP'ed is the year of that row. If it is, you've got the last year.
 * 9) The formula I used was =IF([@Year]=VLOOKUP([@Country],Table1Country]:[Year,3,FALSE),"last year","")
 * 10) I then copy-pasted values and used filtering to get rid of the extra lines.

Sorry to keep using Excel jargon, but many of the shortcuts and formula commands will be the same across many programs. Like, I'm pretty sure these steps would work for Google Sheets (also free). A couple of further notes: the rate values are currently in percentage points, like 5.9% is showing as 5.9, so you'd have to add in the % marks. Converting from spreadsheet into table will require another step, but there are free converters out there (for Excel, at least, and I assume they'd work for Libre). Matt Deres (talk) 16:57, 29 October 2021 (UTC)


 * @Timeshifter A bit more: you'll need to do further cleaning on the list. For example, some countries are on there multiple times due to focus on different regions and so on. There are four lines for the UK, 3 for China, 3 for Iraq, etc. In some cases that may be perfectly fine and expected, but you'll want to double-check them. You may also want to edit the names for brevity (ex: Iran, not "Iran (Islamic Republic of)" and so on). Matt Deres (talk) 17:16, 29 October 2021 (UTC)
 * . Thanks. There are various flag templates that wrap around the long country names, and only show the linked shortened country names. Adding those flag templates is easy and fast. See this section and the following section of Help:Table:
 * Help:Table
 * And converting the finished spreadsheet table to a wikitable is easy. I wrote the Visual Editor section of Help:Table:
 * Help:Table
 * But my knowledge of spreadsheets is very limited, and I basically only know about LibreOffice Calc. What I know is pretty much described in that Help:Table section on VE, and some other sections.
 * So all I understand of what you wrote is up to the part of opening homicide_total_rate_and_count.xls (which Calc does automatically) and sorting by year.
 * I want both rates and counts. I see that the xls file has all the data needed to fill in the columns here:
 * List of countries by intentional homicide rate
 * Could you paste the Excel table you have so far into tab2wiki and copy the wikitext result to a user sandbox? From Help:Table:
 * Visual Editor will load very fast in empty sandboxes: Special:MyPage/Sandbox, Special:MyPage/Sandbox2, Special:MyPage/Sandbox3. As many as you want. ... To find all your sandboxes: Special:PrefixIndex/User: – click link, add user name to the spot labeled "Display pages with prefix:".
 * And if you have the time, energy, and patience could you install LibreOffice Calc and see if it can do what Excel can do. It installs fast. See download info page. I always install the early adopter version, and its help file.
 * --Timeshifter (talk) 17:11, 30 October 2021 (UTC)


 * I'm not going to install a new program just because you don't feel like trying to follow instructions. :) The data set is below in table format. Luckily the sources for both the rate and count were the same, so we didn't need to double up those. Matt Deres (talk) 14:47, 2 November 2021 (UTC)


 * . Thanks for the table of data. I used it to update the Wikipedia article. For more info see:
 * User:Timeshifter/Sandbox165
 * I don't have Excel installed on my PC. I don't want to spend the money. Today I tried using the free version of Excel on the web for the first time. About all I could figure out was how to sort a column in descending or ascending order. As I said before I am a newb at spreadsheets, and only know some very basic stuff I use to edit tables on Wikipedia.
 * So I would need instructions that tell me exactly what to do with the Excel on the web. No missing steps. I would put those instructions in Help:Table. There are a lot of country, state, and province tables on Wikipedia and the web that use sources with multiple years. So it would greatly help with getting better, and more frequently updated, tables on Wikipedia. Many other newbs would help if they knew how to help. :) --Timeshifter (talk) 23:32, 2 November 2021 (UTC)