Module talk:Tabular data

Great
This is great! Now we just need easier ways to edit tables, like Vera's work. Let me try this with this data. – SJ + 12:41, 10 May 2020 (UTC)


 * I didn't realize it during the hackathon, but T251759 is already well underway. Can't wait to see it live! – Minh Nguyễn 💬 19:38, 10 May 2020 (UTC)


 * hot dog! thanks for the link :) and the illustrative attempt here too, good practice to parse. – SJ + 00:26, 11 May 2020 (UTC)

Multiple fields
What would be nice (and would significantly help performance in some cases) is the ability to get multiple fields from a single lookup, like how Module:Covid19Data is called on User:EProdromou (WMF)/COVID-19 case data.

For example:

Not sure how difficult this would be. - Alexis Jazz 06:11, 19 May 2020 (UTC)


 * Thanks for the idea! It would certainly be feasible, but if efficiency or tidiness is the primary consideration, then I think it would be even better to refine or Json2table to allow for the desired format on each row or create a separate function that outputs the whole table in list form. Or were you thinking of a use case where each row would come from a different Commons data table? – Minh Nguyễn 💬 10:19, 20 May 2020 (UTC)
 * I was actually thinking of a case where two (or more) fields from the same row are needed. A template like this one has to do two lookups for the same row, only to return a different field each time. This increases the page preview/rendering time and load on the Wikimedia servers. - Alexis Jazz 03:11, 21 May 2020 (UTC)


 * ✅, although I'd expect that large or complex tables or lists would be better served by a custom Lua function that interacts with the tabular data directly, since that also affords more control over formatting and allows lookups to be reused. – Minh Nguyễn 💬 05:14, 21 May 2020 (UTC)
 * Thanks, I also updated Tabular query. I've noticed though that seemingly took almost a second to preview after I updated the module, where it was about half a second before. This was before I updated that template to use the new functionality. Now that I've updated it, it's back to taking half a second, whereas I was hoping to get the preview/rendering time down to about 0.3 seconds. - Alexis Jazz 06:31, 21 May 2020 (UTC)
 * I see the performance has improved: it's down to about 0.4 seconds now. - Alexis Jazz 08:47, 1 June 2020 (UTC)
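
A minimal sketch of the idea discussed in this thread: find the row once, then return several fields from it, instead of repeating the search for each field. The function name `lookupFields` and the sample values are hypothetical and are not this module's API. The table shape (`schema.fields[i].name` plus `data` as a list of rows) is assumed to match what `mw.ext.data.get` returns for a tabular data page.

```lua
-- Hypothetical sample in the assumed shape of a Commons tabular data page.
local sample = {
    schema = { fields = {
        { name = "region", type = "string" },
        { name = "cases",  type = "number" },
        { name = "deaths", type = "number" },
    } },
    data = {
        { "Alpha", 120, 3 },
        { "Beta",  45,  1 },
    },
}

-- Map each column name to its index, so names are resolved only once.
local function columnIndexes(tab)
    local idx = {}
    for i, field in ipairs(tab.schema.fields) do
        idx[field.name] = i
    end
    return idx
end

-- Find the first row whose searchColumn equals searchValue, then
-- return the requested output columns from that single row.
local function lookupFields(tab, searchColumn, searchValue, outputColumns)
    local idx = columnIndexes(tab)
    local searchIdx = assert(idx[searchColumn], "unknown search column")
    for _, row in ipairs(tab.data) do
        if row[searchIdx] == searchValue then
            local out = {}
            for i, name in ipairs(outputColumns) do
                out[i] = row[assert(idx[name], "unknown output column")]
            end
            return out
        end
    end
    return nil -- no matching row
end

local fields = lookupFields(sample, "region", "Beta", { "cases", "deaths" })
print(fields[1], fields[2])
```

The row scan is the part that gets repeated when a template asks for each field separately, so returning all requested fields from one scan is where the saving would come from.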

Performance
Recently developed  which uses c:Data:Wikipedia statistics/data.tab generated by GreenC bot. One of the issues we ran into was performance: each time the template is invoked, the Commons file is retrieved via mw.ext.data.get, which is slow. List of Wikipedias had over 4,000 invocations, which exceeded Lua's 10-second time limit and produced red error messages. suggested a solution to load the Commons file once per page, but mw.ext.data.get does not support this; mw.loadData does. So the mw.ext.data.get is used in which is then loaded by mw.loadData in. This ensures the file from Commons is loaded once regardless of how often the module is invoked on a page. Is this an issue with this module? Should we recommend readers to use vs. this template, since it is being used as an example? -- Green  C  02:26, 24 May 2020 (UTC)
 * Module:NUMBEROF/data is easily able to provide a cache of the Commons data because the module was written specifically for that data format, and with knowledge of what was wanted by the main module. To do that more generally would be tricky. Using hundreds of calls to Module:Tabular data would consume a lot of resources. If that is ever required, I would think a workable solution would require a custom module like Module:NUMBEROF/data. Re the question: yes, NUMBEROF should be used although I suppose the example in the docs here was intended to show this module's flexibility. I think the docs should include a "but see NUMBEROF" note. Johnuniq (talk) 03:43, 24 May 2020 (UTC)


 * This module is intended to serve a variety of use cases generically, so it's different than NUMBEROF, but I added a "See also" link to that template, just in case. This module provides for situations where the entire table is needed on a given page, as opposed to a lookup of a few values. That function could be made more flexible, along the lines of Json2table, but I think ultimately any use case that requires looking up a lot of values from the same Commons table and including the results on the same page warrants a dedicated Lua module to build that entire portion of the page. Then caching wouldn't be so relevant, because the Commons table would only get loaded once anyways. – Minh Nguyễn 💬 22:44, 25 May 2020 (UTC)
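
The load-once pattern described in this thread can be illustrated with a plain-Lua simulation. This is a sketch, not the code of Module:NUMBEROF/data: `slowFetch` stands in for `mw.ext.data.get`, `loadData` stands in for `mw.loadData`, and the page name and numbers are made up. Under Scribunto, the body of `buildData` would live in a separate data module that the main module loads through `mw.loadData`.

```lua
-- Stand-in for the slow Commons fetch so the sketch runs outside MediaWiki.
-- Under Scribunto this would be mw.ext.data.get(page).
local fetchCount = 0
local function slowFetch(page)
    fetchCount = fetchCount + 1
    return { data = { { "en", 6000000 }, { "de", 2500000 } } } -- made-up rows
end

-- Body of a hypothetical data submodule: fetch once and reshape the rows
-- into a plain lookup table (data only, no functions).
local function buildData()
    local byCode = {}
    for _, row in ipairs(slowFetch("Example.tab").data) do
        byCode[row[1]] = row[2]
    end
    return byCode
end

-- Stand-in for mw.loadData: evaluate the data module on first use,
-- then hand back the cached table on every later call.
local cache
local function loadData()
    if cache == nil then
        cache = buildData()
    end
    return cache
end

-- What each template invocation would do: a cheap table lookup.
local function articles(code)
    return loadData()[code]
end

for _ = 1, 4000 do
    articles("en")
end
print(fetchCount)
```

As noted in the replies above, this works well when the data module is written for one specific table and one specific lookup; a generic cache for arbitrary tables and queries would be harder to build.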

Getting row or column data
Just an idea, not sure about the technical feasibility. Similar to getting a cell value, is it possible to get all the values of a column or a row? The output would be CSV (or some other delimited values) instead of a single value. One use I have in mind is as a data series in Graph:Chart. - Timbaaa -> ping me 13:31, 14 July 2020 (UTC)


 * That's definitely feasible, though it might be easier to integrate something with Graph:Lines, which is already pretty usable with tabular data, as seen in COVID-19 pandemic in the San Francisco Bay Area. It would look pretty similar to the existing  function, but just the part that collects the  s of the elements in  . If you're planning to use this functionality inside a module instead of directly inside a template or article, I'd suggest working with   directly so you have maximum control over formatting. – Minh Nguyễn 💬 19:46, 19 September 2020 (UTC)
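
A minimal sketch of what collecting one column into a delimited string could look like. The function name `columnValues`, its parameters, and the sample data are hypothetical and not part of this module. The table shape is assumed to match the tabular data format (`schema.fields` and `data`).

```lua
-- Hypothetical sample in the assumed shape of a Commons tabular data page.
local sample = {
    schema = { fields = {
        { name = "date",  type = "string" },
        { name = "cases", type = "number" },
    } },
    data = {
        { "2020-05-01", 120 },
        { "2020-05-02", 45 },
    },
}

-- Return every value of the named column joined by a separator,
-- or nil if the column does not exist.
local function columnValues(tab, columnName, separator)
    local colIdx
    for i, field in ipairs(tab.schema.fields) do
        if field.name == columnName then
            colIdx = i
            break
        end
    end
    if not colIdx then
        return nil
    end
    local values = {}
    for i, row in ipairs(tab.data) do
        local v = row[colIdx]
        -- Emit an empty string in case a cell has no value.
        values[i] = (v == nil) and "" or tostring(v)
    end
    return table.concat(values, separator or ",")
end

print(columnValues(sample, "cases"))
print(columnValues(sample, "date", ";"))
```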

Search as Number
Great job!

For some reason, it doesn't work for me. For example, a request like this:

returns an empty string instead of "Ajdovščina".

Help me please. — Preceding unsigned comment added by Игорь Темиров (talk • contribs) 19:14, 8 November 2020 (UTC)


 * There's no 261 in the cases column of c:Data:COVID-19 Slovenia cases per capita.tab. The cases value for Ajdovščina is 2204. — 𝐆𝐮𝐚𝐫𝐚𝐩𝐢𝐫𝐚𝐧𝐠𝐚 ☎ 01:05, 24 June 2021 (UTC)

Search more than 1 column
Would it be possible to make it search two (or more) columns? I.e.: E.g.:

— 𝐆𝐮𝐚𝐫𝐚𝐩𝐢𝐫𝐚𝐧𝐠𝐚 ☎ 01:00, 24 June 2021 (UTC)


 * I've created Module:Tabular_data/sandbox with a function to try and handle the second search requirement. It doesn't work. However, I can't get the existing module to return data from c:Data:UN:Total population, both sexes combined.tab.


 * What am I missing? Could it be the page name with a colon that is invalid? —  Jts1882 | talk 13:41, 30 June 2021 (UTC)
 * There were a few issues:
 * The page name is a numbered parameter so should be trimmed. The sandbox does this now and the non-sandbox example above is edited to remove white space and linefeeds.
 * The search comparisons assume string values. The population data has numbers so these need to be converted before comparing (as done in the sandbox) or the module modified to use the types (more involved).
 * Anyway, this shows how the data in the data page at commons can be retrieved. The two examples (Afghanistan 1950 and Zambia 2020) get the right numbers. —  Jts1882 | talk 16:32, 30 June 2021 (UTC)
 * Great! Thanks.
 * I wrapped it (that particular application) in a template, and started testing it out here. It works, but runs out of time limit pretty quickly. Could it be made more efficient? — 𝐆𝐮𝐚𝐫𝐚𝐩𝐢𝐫𝐚𝐧𝐠𝐚 ☎ 02:46, 1 July 2021 (UTC)
 * You probably don't need ustring (do you?), and string will do the trick, but I don't see that making a big difference there.
 * One option is specifying the columns by number so the module doesn't have to search for them by name. Inconvenient, but... faster. (Then, again, with only 3 columns on that table, I don't see it making much of a difference). — 𝐆𝐮𝐚𝐫𝐚𝐩𝐢𝐫𝐚𝐧𝐠𝐚 ☎ 02:55, 1 July 2021 (UTC)
 * Oh, I see, you have to pull down the whole table at every call: local data = args.data or mw.ext.data.get(page) Yeah, it ain't small (it's at the 2MB limit). What's the alternative; slicing it into a different table for each year? Any batch proc for doing that? — 𝐆𝐮𝐚𝐫𝐚𝐩𝐢𝐫𝐚𝐧𝐠𝐚 ☎ 07:40, 1 July 2021 (UTC)
 * That was a concern for memory usage but I think that is cached. I did a few tests on a blank page and the memory usage didn't increase dramatically when calling the template multiple times. It's clearly processing time which goes up with the number of calls. There is no noticeable difference between Afghanistan and Zambia so the looping is fast (as expected). It's still possible  is responsible, even if not loading each time, as dealing with the cache might take time. It needs some more tests.
 * Incidentally your template seems to have considerable overhead, both doubling the time and causing a high expansion depth. Using invoke gets the time down to just over 100ms each call. —  Jts1882 | talk 08:00, 1 July 2021 (UTC)
 * I've created a function that does the bare minimum (, line 208). It doesn't reduce the time substantially (100-150ms depending on run). So I think this sort of template can only be used safely about 50 times on a page.
 * For generating tables, the alternative is a module that takes a list of countries like Module:Country population. A lot more work, but you can get it to do exactly what is needed with appropriate options. —  Jts1882 | talk 09:05, 1 July 2021 (UTC)
 * "Incidentally your template seems to have considerable overhead, both doubling the time and causing a high expansion depth. Using invoke gets the time down to just over 100ms each call."
 * Can you pinpoint what it is? I invoke the module once if the given date parameter is a year, and twice to interpolate values for a specific date (and once again before those to verify that the country is on the table). What do you reckon needs to be done to reduce the overhead? — 𝐆𝐮𝐚𝐫𝐚𝐩𝐢𝐫𝐚𝐧𝐠𝐚 ☎ 08:30, 3 July 2021 (UTC)
 * It looks like it's invoked twice if given date parameter is a year and three times if a specific date. It's invoked once for the test and then once or twice for the output. Is the test necessary? I've removed it in the template and the output in the documentation is the same, but takes less processing time (a bit more than half). —  Jts1882 | talk 09:01, 3 July 2021 (UTC)
 * I saw that a large chunk of the transclusion expansion time was from density, so I replaced the call to convert, which in turn invokes a very general and flexible module, by a simple calc only for km² and sqmi. Density is still at 77% of transclusion expansion time (not sure how those pcts add up, with UN pop at 65%), and Draft:List of countries and dependencies by population density is still timing out Lua (it seems) at the 14th table row (Jersey). I see that 93% of the Lua time is consumed by Scribunto_LuaSandboxCallback::get -- what's that? Cheers. — 𝐆𝐮𝐚𝐫𝐚𝐩𝐢𝐫𝐚𝐧𝐠𝐚 ☎ 12:24, 3 July 2021 (UTC)
 * I assume that is the function behind.
 * There is something odd in Draft:List of countries and dependencies by population density. While it starts giving timeout error in the 14th line, other lines display without errors down to line 59. What makes those lines avoid the timeout? —  Jts1882 | talk 13:46, 3 July 2021 (UTC)
 * Yeah, I noticed that. Bloody good question. I sort of assumed the invoke queue doesn't follow the order in the code. — 𝐆𝐮𝐚𝐫𝐚𝐩𝐢𝐫𝐚𝐧𝐠𝐚 ☎ 14:23, 3 July 2021 (UTC)
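
Two points from this thread can be sketched together: matching on more than one column, and converting the search value before comparing it with a numeric cell (template arguments arrive as strings). This is an illustration under assumptions, not the sandbox code: the function names `matches` and `lookup`, the `criteria` table, and the sample rows are all hypothetical.

```lua
-- Hypothetical sample; the country names and figures are made up.
local sample = {
    schema = { fields = {
        { name = "country",    type = "string" },
        { name = "year",       type = "number" },
        { name = "population", type = "number" },
    } },
    data = {
        { "Examplia", 1950, 800 },
        { "Examplia", 2020, 1200 },
        { "Sampleland", 2020, 300 },
    },
}

-- Compare a cell with a search string. If the cell holds a number,
-- convert the string first so that "2020" matches 2020.
local function matches(cell, wanted)
    if type(cell) == "number" then
        return cell == tonumber(wanted)
    end
    return cell == wanted
end

-- Return outputColumn from the first row that satisfies every criterion.
local function lookup(tab, criteria, outputColumn)
    local idx = {}
    for i, field in ipairs(tab.schema.fields) do
        idx[field.name] = i
    end
    local outIdx = idx[outputColumn]
    if not outIdx then
        return nil
    end
    -- Resolve criteria column names to indexes once, before scanning rows.
    local tests = {}
    for name, wanted in pairs(criteria) do
        if not idx[name] then
            return nil
        end
        tests[#tests + 1] = { idx[name], wanted }
    end
    for _, row in ipairs(tab.data) do
        local ok = true
        for _, t in ipairs(tests) do
            if not matches(row[t[1]], t[2]) then
                ok = false
                break
            end
        end
        if ok then
            return row[outIdx]
        end
    end
    return nil
end

print(lookup(sample, { country = "Examplia", year = "2020" }, "population"))
```

This does nothing about the cost of fetching the table on each call, which the timings above suggest is the dominant factor; it only covers the matching step.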

Null value and output format error
If the lookup points to a null value cell and there is an output format, it gives an error. Could you fix this, please? Bean49 (talk) 20:05, 23 January 2024 (UTC)

There could be several output columns with only one of them null. Bean49 (talk) 20:24, 23 January 2024 (UTC)
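
One possible way to handle this, sketched under assumptions: substitute a fallback text for any null value before filling in the format string. The sketch assumes the output format uses `$1`, `$2`, ... placeholders. The function name `formatValues` and the `nullText` parameter are hypothetical, and this is not how the module currently formats its output.

```lua
-- Fill $1, $2, ... in a format string from a list of values,
-- using nullText (default: empty string) wherever a value is nil.
-- count is passed explicitly because the list may contain nils.
local function formatValues(format, values, count, nullText)
    nullText = nullText or ""
    -- Parentheses drop gsub's second return value (the match count).
    return (format:gsub("%$(%d+)", function(n)
        local i = tonumber(n)
        if i > count then
            return nil -- returning nil leaves an unknown placeholder as-is
        end
        local v = values[i]
        if v == nil then
            return nullText
        end
        return tostring(v)
    end))
end

print(formatValues("$1 cases, $2 deaths", { 120, nil }, 2, "n/a"))
```

With several output columns and only one null, the other placeholders are still filled in normally.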