Primož Jakopin

Primož Jakopin (pron. Premozh Yacopeen), born 30 June 1949, is a Slovenian computer scientist known for his work in language technology and for his contributions to speleology.

Early life and education
Jakopin was born in 1949 in Ljubljana, Slovenia. The family lived in the village of Leskovec pri Krškem in the Dolenjska region and moved to Ljubljana in 1956.

After a degree in technical mathematics (Numerično računanje singularnih integralov / Numerical Computation of Singular Integrals) at the University of Ljubljana in 1972, he obtained a master's degree in information sciences at the University of Zagreb in 1981 with the thesis Entropija imena i prezimena u Sloveniji / On entropy of first names and last names in Slovenia. In 1999 he received a Ph.D., again at the University of Ljubljana, with the thesis Zgornja meja entropije pri leposlovnih besedilih v slovenskem jeziku / Upper Bound of Entropy in Slovenian Literary Texts.

Computational linguistics
He was a senior lecturer at the Department of Comparative and General Linguistics, Faculty of Arts, University of Ljubljana, where he taught language technologies with an emphasis on lemmatisation. From 2001 to 2012 he was the Head of the Corpus Laboratory at the Fran Ramovš Institute of Slovenian Language (within the Scientific Research Centre of the Slovenian Academy of Sciences and Arts). He participated in a number of European projects on language resources.

His major pieces of software are IBIS for the Digital DEC 10 mainframe computer (1981), INES for the ZX Spectrum microcomputer (1985), STEVE for the Atari ST (1987-1992), EVA for DOS (1992-) and for the Microsoft Windows family of operating systems (1996-), and NEVA, a Windows server search engine (1999-). From 1992 to 1994 he supervised the transfer of the Standard Slovenian Dictionary (SSKJ) from the printed to an electronic version (EVA OCR, DOS version). In 1997 he wrote the first part-of-speech tagger for Slovenian texts. In 1999 he started an Internet text corpus with a concordance service and linked wordform and reversed wordform frequency dictionaries. It is available as Nova beseda (New word).
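As a rough illustration of the kinds of resources mentioned above, the following minimal Python sketch builds a wordform frequency list, orders the wordforms by their reversed spelling, and produces simple concordance lines. It is not taken from EVA, NEVA or Nova beseda. The sample text, the tokenisation rule and the function names `wordform_frequencies`, `reversed_order` and `concordance` are all assumptions made for the example.

```python
import re
from collections import Counter

def tokenize(text):
    # Assumption: a wordform is any run of letters or digits, lowercased.
    return re.findall(r"\w+", text.lower())

def wordform_frequencies(text):
    # Frequency dictionary: wordform -> number of occurrences.
    return Counter(tokenize(text))

def reversed_order(freqs):
    # Order wordforms by their spelling read from the end of the word,
    # so that forms sharing an ending are listed next to each other.
    return sorted(freqs, key=lambda w: w[::-1])

def concordance(text, keyword, width=2):
    # Keyword-in-context lines: (left context, keyword, right context).
    tokens = tokenize(text)
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines

sample = "The cave and the caves of the karst"  # hypothetical sample text
freqs = wordform_frequencies(sample)
print(freqs.most_common(1))
print(reversed_order(freqs))
print(concordance(sample, "the", width=1))
```

Sorting by reversed spelling groups wordforms with the same ending, which is the usual purpose of a reverse dictionary in a highly inflected language.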

Speleology
In high school, Jakopin read the book Kraški svet in njegovi pojavi / Karst world and its phenomena by Pavel Kunaver and, especially because of its photographs by Bogumil Brinšek, became interested in speleology. In 1966 he joined the Ljubljana Cave Exploration Society (DZRJL) to learn more about caves and to take part in the exploration of new ones.

As a mathematician he was particularly interested in the two principal cave size parameters calculated from a cave survey: length and depth. They are closely related to the definition of a cave as a hollow underground formation large enough for human exploration, and are used to compare and classify caves, for instance in lists of the longest and the deepest caves. Whereas cave depth is well defined as the difference in elevation between the highest and the lowest point of the cave, cave length, usually given as the sum of distances between the survey stations on the cave floor, involves considerable arbitrariness. Cave science had long recognised that length does not represent cave size properly, since caves are 3D objects with volume as their main and most noticeable feature, yet volume was rarely used for lack of a suitable measurement method. In 1972, at the 6th Yugoslavian congress of speleology, Jakopin proposed volume, measured from a computer-based cave model, as the main cave parameter in the paper O numeričnem vrednotenju kraških objektov ("Numerical Valuation of Objects on Karst").
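The two conventional parameters can be computed directly from survey data, as the following minimal Python sketch shows. It assumes each survey station has known (x, y, z) coordinates in metres, with z pointing upwards, and that the survey is given as a list of shots between pairs of stations. The station names, coordinates and function names are hypothetical.

```python
import math

# Hypothetical survey: station name -> (x, y, z) in metres, z pointing up.
stations = {
    "A": (0.0, 0.0, 0.0),
    "B": (3.0, 4.0, 0.0),
    "C": (3.0, 4.0, -12.0),
    "D": (6.0, 8.0, -12.0),
}

# Survey shots: pairs of stations connected by a measured line.
shots = [("A", "B"), ("B", "C"), ("C", "D")]

def cave_length(stations, shots):
    # Length as the sum of distances between connected survey stations.
    return sum(math.dist(stations[a], stations[b]) for a, b in shots)

def cave_depth(stations):
    # Depth as the elevation difference between highest and lowest point.
    heights = [z for (_, _, z) in stations.values()]
    return max(heights) - min(heights)

print(cave_length(stations, shots))  # 5 + 12 + 5 = 22 m
print(cave_depth(stations))          # 12 m
```

The length depends on which stations the surveyors chose and how they connected them, which is the arbitrariness noted above. The depth depends only on the two extreme points.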

In 1974 he devised a 3D model for approximating cave space, based on a series of connected polygonal cross sections, and used it to make a 3D survey of Skednena jama cave, a fossil ponor at the northern rim of the Planinsko polje karst field in Slovenia. By 1979 he had developed a computer program to support the model, computing its vertex coordinates and the parameters of the model (and thus of the cave): length, surface area and volume. In 1981 Jakopin published the results for Skednena jama cave in the paper Macrostereological Evaluation of Cave Space. The model had 305 vertices and 51 cross sections; its total length was 205 m, its surface area 8,900 m² and its volume 6,500 m³, with the error estimated at below 5%. Early in 1981 a larger cave, Mačkovica, located in the same area as Skednena jama cave, was surveyed. Here the model consisted of 106 cross sections and 709 vertices, for a length of 650 m and a volume of 38,800 m³. The volume error was estimated at below 2%.

After microcomputers became widely available in the early 1980s, Jakopin developed a different method of volume calculation that required much less computing power. Instead of computing the parameters of a cave model segment by cutting it iteratively into ever thinner parallel slices, the new method broke every segment up into a series of tetrahedrons, the parameters of which can be computed directly. He implemented the updated model on a Sinclair ZX Spectrum home computer. In 1982 a team led by Jakopin surveyed a 450 m long section of Postojnska jama cave, from the Concert Hall to the end of the Great Mountain, and the model yielded a volume of 313,000 m³. In 1984 Daniel Rojšek and his team measured a 54 m long segment of Martel's Hall at the end of the Škocjanske jame caves, with a volume of 220,000 m³, one tenth of the volume of the entire hall (2,200,000 m³), which was computed in 2018. The hall is comparable in size to other large underground chambers.
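The idea of computing a segment's volume from tetrahedrons can be illustrated with a short Python sketch. It is not Jakopin's program and does not reproduce his exact decomposition. It assumes that a segment is bounded by two polygonal cross sections with the same number of vertices, listed in corresponding order, triangulates the segment's surface, and sums the signed volumes of the tetrahedrons spanned by each surface triangle and a common apex. The function names and the test shape are hypothetical.

```python
def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def tetra_volume(apex, a, b, c):
    # Signed volume of the tetrahedron (apex, a, b, c): one sixth of the
    # scalar triple product of its three edge vectors from the apex.
    return dot(sub(a, apex), cross(sub(b, apex), sub(c, apex))) / 6.0

def segment_volume(lower, upper):
    # Assumption: both cross sections have the same number of vertices,
    # given in corresponding order around the passage.
    if len(lower) != len(upper) or len(lower) < 3:
        raise ValueError("cross sections need equal vertex counts (>= 3)")
    n = len(lower)
    apex = lower[0]
    triangles = []
    for i in range(n):
        j = (i + 1) % n
        # Each side quadrilateral is split into two triangles.
        triangles.append((lower[i], lower[j], upper[j]))
        triangles.append((lower[i], upper[j], upper[i]))
    for i in range(1, n - 1):
        # The end caps are fan-triangulated with opposite orientation.
        triangles.append((upper[0], upper[i], upper[i + 1]))
        triangles.append((lower[0], lower[i + 1], lower[i]))
    return abs(sum(tetra_volume(apex, a, b, c) for a, b, c in triangles))

# Hypothetical test segment: a 1 m cube between two square cross sections.
square_low = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
square_high = [(0, 0, 1), (1, 0, 1), (1, 1, 1), (0, 1, 1)]
print(segment_volume(square_low, square_high))  # 1.0 cubic metre
```

Each tetrahedron needs only a handful of multiplications and additions, with no iteration, which suggests why such an approach suits a machine with little computing power. The volume of a whole model would be the sum over its segments.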

In 2019 and 2020 Jakopin wrote articles about three people who devoted most of their lives to deep caves, in order to reach depths greater than 2,000 meters: Pavel Demidov in Verëvkina Cave, Jurij Kasjan in Voronja Cave and Aleksandr Višnjevskij in Boybuloq.

Family
His father, Franc Jakopin, was a Slovenian Slavist, lexicographer and onomatologist; his mother, Gitica Jakopin, was a translator, writer and poet. His brother Japec Jakopin is a yacht concept designer and his brother Jernej Jakopin is a naval architect.

Publications

 * CORTES - a text corpus of Slovenian. In publication: Digital resources for the humanities: Conference abstracts (University of Sheffield, 10–13 September 2000). - Sheffield: University of Sheffield, 2000. - p. 70-72.
 * EVA - an Internet tool for textual and lexical resources. In publication: Linguistics and language studies / 32nd Annual Meeting, Ljubljana, 8–11 July 1999. - Ljubljana: University, Faculty of Arts: Societas Linguistica Europaea, 1999. - p. 98.
 * The feasibility of a complete text corpus. LREC 2002: proceedings.
 * On text corpora, word lengths, and word frequencies in Slovenian. In publication: Contributions to the science of text and language / edited by Peter Grzybek. - Dordrecht: Springer, 2006. (Text, speech and language technology; vol. 31). - ISBN 1-4020-4067-9. - p. 171-185.
 * Query-driven dictionary enhancement. Co-author: Birte Lönneker. In publication: Proceedings of the Eleventh EURALEX International Congress, EURALEX 2004, Lorient, France, July 6–10, 2004 / Geoffrey Williams and Sandra Vessier (eds.). - Lorient: Université de Bretagne-Sud, cop. 2004-. - p. 273-284.
 * Slovenian texts on the internet. In publication: Zapiski: Chronicle of the American Slovene Congress. Issue 7 (May 2000), p. 4-7.
 * Words and nonwords as basic units of a newspaper text corpus. In publication: COMPLEX 2001 / 6th Conference on Computational Lexicography and Corpus Research "Computational Lexicography and New EU Languages", Mason Hall, Birmingham, 28 June-1 July 2001. - Birmingham: Centre for Corpus Linguistics, Department of English, University of Birmingham, 2001. - p. 49-65.
 * Entropija v slovenskih leposlovnih besedilih (Upper Bound of Entropy in Slovenian Literary Texts), Založba ZRC, Ljubljana 2002.
 * O oblikoslovnem označevanju slovenskega besedila (Morphological tagging of Slovenian texts) (co-author A. Bizjak), Slavistična revija 1997.
 * Odzadnji slovar slovenskega jezika (Inverse Dictionary of Slovenian language) (co-author M. Hajnšek-Holz), Ljubljana 1996.