Wikipedia:Reference desk/Archives/Computing/2022 November 29

= November 29 =

On the page URL shortening, what is meant by "as the service delivery time increases, the length of the URL will also increase"?
There's an example given, but the example is just a shortening of the URL for the history of the article. I can't see what it's driving at. Card Zero (talk) 00:39, 29 November 2022 (UTC)


 * Me neither, it makes no sense; I've removed the paragraph. Thanks for spotting it. Elizium23 (talk) 00:43, 29 November 2022 (UTC)
 * In fact the history of that diff is stranger than fiction! Elizium23 (talk) 00:44, 29 November 2022 (UTC)
 * It's very poorly worded that's for sure and should be removed without a citation but I think what it's referring to is that any URL shortening service is eventually (theoretically at least) going to run out of codes of a certain length based on whatever charset they are using. They could increase the character set eg of it's alphanumeric but lower case only they could potentially add upper case provided they enforced case sensitivity at the beginning (if they didn't then it's quite likely they will break many existing links) or add symbols. But eventually even that will run out, especially since while emojis and other characters e.g. CJK are an option they probably interfere with usability too much to be worth it. Even symbols might be seen the same. So either the service needs to be shut down for new links, add a new different URL (effectively changing one character in the part which did not change previously), or add an extra character. As a practical matter, even case insensitive alphanumeric can quickly get very large, so depending on the starting length it may be that this limit is never ran into. From URLShortener, I gather the Wikimedia URL shortener service started at 1 and perhaps just keeps increasing as needed (alternatively few characters requires some special permission and is reserved for especially important URLs) and was up to 4 at the time the Q28 added that detail but may or may not eventually hit 5 characters. Nil Einne (talk) 13:45, 29 November 2022 (UTC)
 * That isn't necessarily so. Links could expire on a timed basis, be reaped when they go 404, deleted manually by the owners. There are ways to recycle the namespace. Also, if the length is chosen wisely to begin with, the universe will die off before the link shortener does. YouTube has kabillions of videos and their URL scheme hasn't been lengthened. Elizium23 (talk) 13:48, 29 November 2022 (UTC)
 * While timed links are a possibility, most services are based on the assumption the short URL basically lasts as long as the service lasts. Unless the links are timed, recycling intentionally isn't practiced and definitely generally isn't introduced later since as with case-sensitivity, introducing such a change mid way is a terrible idea as it breaks existing links when people did not expect it to. For starters, there is no way to be sure a link is permadead since even if the domain has been taken over by a hijacker, it's always possible in 10 years time, someone may recover the domain either by paying the hijacker or because the hijacker has given up, and reintroduce the service. And putting that aside, some people want to know what the URL was even if it's dead e.g. to look for it on a web archive. They don't want to wonder why granny sent them a porno link 10 years ago when in reality she sent them a link to some Catholic website it's just that the shortened URL was re-used. (Some people may get confused by redirections from dead URLs but people who know what they're doing can figure out where the URL pointed to and therefore whether granny really sent them a porno link 10 years ago.) Of course, a service could have a way to check a shortened URL's history or at least when the current redirection was added, but ultimately the point is that people generally expect a shorted URL to just work even if it's been a long time, and if they do end up somewhere odd, it's on the destination end not the shortened end. In other words, if I click on a URL even a very long time, it will still take me to wherever the person was trying to send me to, even if that URL may be dead or may redirect me perhaps semi transparently. The most you generally expect to happen beyond the service just dying itself, is that the redirection may be disabled for security or ToS reasons in which case you may not know what the URL pointed to, but at least it doesn't seem like granny sent you a porno link due to the destination URL suddenly changing. Note in any case, recycling only prevents the problem completely if your rate of recycling is higher than new links. If it isn't, the problem may be delayed but will still occur at least theoretically, although as I said in my first reply depending on the rates involved it may or may not be a practical concern. (Sorry if I didn't make this clear but by theoretical I meant if we put aside other theoretical considerations like whether the internet in its current form let alone the universe will last that long and by practical I meant if we do introduce such considerations.) As for "wisely" the problem here is that there is an inherent contradiction between a URL shortening and choosing a long length. The youtu.be service chose to just keep the entirety of the video ID even if 11 characters means it's fairly long so you generally only reduce the length by half, but for services without such obvious reasons to chose such a long URL for your URL shortening, many do not chose such a long length instead often limiting it to 4 or 5 characters, in which case such concerns may move out of the realm of theory at some stage. Nil Einne (talk) 14:25, 29 November 2022 (UTC)
 * "As the service delivery time increases, the length of the finger will also increase" this just sounds like Google Translate screwing up. ― Blaze WolfTalkBlaze Wolf#6545 14:47, 29 November 2022 (UTC)
 * In the example "https://w.wiki/4ozb" (now removed), the path ("4ozb") consists of four characters. Assuming that only the 10 digits and the 26 lower-case letters are used, there are merely 1679616 four-character combinations. Assuming the host remains the same and all shortened URLs are unique, by the 1679617th time a shortened URL is requested, the service must return a longer path. When all five-character combinations are used up, it will have to start using six-character paths, and so on. --Lambiam 18:01, 29 November 2022 (UTC)
 * That makes the assumption that the URL is somehow encrypted, doesn't it? I wrote a URL shortener for an organization a while back. I used a hash table. So, I took a URL of any arbitrary length and I produced a 6-character code that could be used within the organization to access the original page. The hash table was resident on the local DNS and it converted internal requests to the proper URL. The point is that a 1 character URL (I know that doesn't exist) and a 255 character URL would both be converted to 6 characters. 12.116.29.106 (talk) 18:32, 29 November 2022 (UTC)
 * It does not use that assumption. The only assumptions are those explicitly stated. You cannot form 1679617 different four-character words using an alphabet with 36 characters. --Lambiam 22:58, 29 November 2022 (UTC)
 * First of all, you assumed incorrectly, because Wikimedia's URL shortener uses capital letters as well. That gives a space of 62 characters: 14,776,336 permutations in a 4-character URL.
 * I want to raise another issue that's very important: even if the length of the shortened URL grows as the service accumulates more redirects, its growth will decelerate markedly (logarithmically?) every time it grows, because the namespace grows vastly. So it's easy to rightsize a shortened URL; you start small - say 4 positions of 62 characters each, and then you add a position when you exhaust the namespace, and with 5 positions you have 916 million slots, then 56.8 billion, and then you can bide your time before you expand the URL beyond 6 positions. QED. Elizium23 (talk) 23:06, 29 November 2022 (UTC)
 * The (now deleted) statement in the article about the length of the URL increasing with the service delivery time was about URL shortening services in general, not specifically Wikimedia's URL shortener. And so was my comment. If the alphabet is finite, the length of the paths will keep growing as the number of uniquely assigned paths grows unboundedly. --Lambiam 10:29, 30 November 2022 (UTC)
 * But each character added to an a hash increases that hash's namespace exponentially. It is absurd for you to say that the number of uniquely assigned paths would grow "unboundedly" because we're on the finite Internet, and URL shortening services are a dime a dozen, competing with each other and restricted to various specialties. Can you say with a straight face that a normal shortening service, much less Wikimedia, barring malice or incompetence, will require more than 56.8 billion slots for URLs in the next 100 years? Elizium23 (talk) 13:34, 30 November 2022 (UTC)
 * There's finite and then there's "finite". Being finite is not the same as being exhaustible.  It doesn't take very many more digits to (for example) have a finite number of possibilities which is large enough to be actually inexhaustible, given that the universe has only so many atoms.  It is a well known bit of trivia that every well-shuffled deck of cards is expected to be a truly unique ordering; given that there are 8E67 possible decks of cards, while a finite number it's just stupid to think that you're going to hit the same ordering twice given that, even if it takes only 60 seconds to shuffle such a deck, that's still more minutes than has existed since the big bang.  Heck, it's more atoms than there are in the observable universe.  Similarly, it's rather trivial to create, with nothing more than the standard character set of the English QWERTY keyboard, a relatively small string of characters which is functionally inexhaustible, even on the "age of the known universe" time scale.  -- Jayron 32 14:23, 30 November 2022 (UTC)
 * The phrase that captures this phenomenon is "combinatorial explosion," and it's a phrase I use often, adding only a subtly-different explanatory suffix each time.
 * Nimur (talk) 01:25, 6 December 2022 (UTC)

Question about Microsoft Windows fonts
I noticed that some fonts have "W01" in the name, other fonts have "W03" and "W05". What does it mean? 2001:B07:6442:8903:D58A:6781:43DD:81FE (talk) 14:13, 29 November 2022 (UTC)
 * It is a standard notation for web fonts. See this chart of W indicators. 12.116.29.106 (talk) 14:33, 29 November 2022 (UTC)

Linux CLI Graphing Tool
Many years ago, I believe it was on Slackware, but may have been when I first started using Redhat, I used a program in Linux that took a text file and produced an SVG. The text file contained edge definitions like A->B and A->C. It would read in those definitions and produce a graph image (svg). It was command line. I wasn't using X at the time. I just added them into a latex file that I was rendering to PDF and opening on another computer. I've been searching and I can't figure out what that program was and if it still exists. 12.116.29.106 (talk) 18:20, 29 November 2022 (UTC)


 * Perhaps Graphviz. There are some examples, including the input text files, at this page. -- Finlay McWalter··–·Talk 22:53, 29 November 2022 (UTC)


 * The general topic is graph drawing; if Graphviz isn't what you're looking for, there's a list of other programs in that article which may help. -- Finlay McWalter··–·Talk 23:04, 29 November 2022 (UTC)