User:EvanCarroll/U2 UniVerse Database Editorial

This is a quick overview of some of the features of U2 UniVerse that make it different.

The big problem
In the 70's there was a general rule that anything could be sacrificed for speed. This lead developers everywhere to believe that ease of use and separation of concerns had no value in design. UniVerse was born in this environment, and it shows today. The software brandishes its age in every way: its query language is intrinsically clumped together with the user interface, line-printer configuration, and output formats. Today, we don't have these issues which leaves modern technology separated, and easier to conquer. This is a benefit and one that you'll miss as you delve further into the black abyss of the U2 product line.

=
==============

Wow, you really have no idea what you are talking about Mr Evans. Universe has loads of old "features" that stem from its origins as a Pick relative, and as such it requires it keep much of this "old fashioned" architecture for compatability reasons. However it also has a whole host of new features that (while admittedly poorly documented) work very well and make Universe(and Unidata) fairly up to date in most areas. It might look backwards to you, but plenty of large, successful organisations with fairly advanced Universe based systems would disagree with your assessment. Its a shame this garbage comes up quite high on a Google search when looking for information on Universe, it could put some people off a very good Database system that is superior to Normalised systems in many ways.

=
==============

I agree with Mr. Evans. People won't be put off because most people who search for U2 UniVerse are the ones that are already frustrated by this dysfunctional database with bare minimum documentation available on Internet. Even Assembly language makes more sense than this database system. This database system has prevented our organization from moving forward and doing any advanced business intelligence because our system vendor does not want to move away from it. If this system is really so good then there's no reason why IBM would give up on it and sell it to Rocket. The reason why Google ranks this page so high is because you can't find any other useful and organized information regarding UniVerse, which renders this database system useless.

=
==============

It is sadly true that the UniVerse documentation is terrible and mostly aimed at people who already understand the database model, which is not surprising since it began its life as an emulator.

But once you do really understand it - and it means discarding many of those things you learned in that dull database 101 class you attended back at college and accept that Codd is not the final word in all things data - the flexibility and power it offers is simply unmatched by mainstream databases. It's transactionally more efficient, has a better locking model, and the fact that it embeds a fully capable business language for its stored procedures means that the rules stay close to the data and can be more easily shared between calling environments.

And it supports all the modern standards, APIs and language stacks, so it's hardly old fashioned. UniVerse is in fact younger than Oracle, SQL Server and many other familiar databases.

It's query language, RetrieVe, is just that - an enquiry language. For people who want to enquire on the data. So it presents that data in a useful way - like you might want to actually see it. If you want a more familiar SQL-like query language that doesn't, well it supports SQL so job done. And unlike other databases, you don't need to use the query language at all! If you step back and think about it, the whole idea of using queries as the main route to update a database is pretty stupid and inefficient, so with UniVerse you get the choice of a relational-like query driven interface and non-relational NoSQL-like direct primary key access, which coupled with the more complex data modelling makes it a very fast and low overhead option for transactional applications.

Oh and IBM didn't "give it up and sell it" - the entire division organized the sell off to get more autonomy away from DB2.

=
==============

Rocket U2 has Universe documentation online. Problem solved!

Files and tables are largely paralleled.

Non-1NF:
 * You can create virtual (non-physical) columns attached to a specific file using I-descriptors also know as I-types or interpretive types. These types can contain BASIC functions in UniBASIC, arithmetic string or logical expressions, and external file references using (TRANS).


 * Custom phrases can function to return more than one row, none of which have to be in the destination table. Phrases can be global.

All rows are implicitly hidden in UniVerse: '*' is a customizable SQL macro (stored in the dictionary under the @SELECT phrase) which expands to whatever list you want it to.

Third party interoperability is amusing because ODBC requires dynamic normalization to a flat 2d table.

Blurs formatting and conversions by including them in SQL metadata through Field Qualifiers:
 * FMT: Default output formatting. I feel this deserves to be quoted:

Here is an example taken from a table in the Retreive Guide for 10.3. MR2,D$*1 Allows two decimal positions, inserts a comma          23582.49 becomes 4       every three digits to the left of the decimal position, **$23,582.49db suffixes all positive values with db, prefaces the     right-justified output with a $, and right-justifies the entire result within a 12-character field with a left fill of        asterisks.
 * FMT will adversely effect SORT.
 * CONV: Default conversion of data, this would permit you to store something as a DATE, and convert it on output transparently. An example of this the Chinese year conversation set with the column statement . Awkwardly, the example given in the RetreiVe Guide for 10.3 is "sheep", which isn't even a Chinese year. Sheep is a mistranslation of "goat" the real name of the personality; but alas, that doesn't matter much because July 2, 1995 falls on the year of the boar.
 * You can chain them <tt>CONV expr CONV expr</tt>
 * <tt>DISPLAYNAME</tt> also <tt>COL.HDG</tt>: Default column reference, used often with UniVerse SQL interface. Other than being a syntactic synonymy this unfortunately has display elements too it other <tt>"Foo'L'Bar"</tt> will instruct all outputting mechanisms to split the header where the <tt>'L'</tt> is found. This can be overridden in <tt>LIST</tt> by using <tt>COL.HDG</tt>

UniBASIC
The official programming language of UniVerse is UniBASIC. The use of BASIC involves the following TCL commands:


 * <tt>BASIC</tt> (compiles the program)
 * <tt>CD</tt> (shorthand for <tt>COMPILE.DICT</tt>)
 * <tt>CATALOG</tt>
 * <tt>RAID</tt> (debugs)
 * <tt>FORMAT</tt> (pretty-print)
 * <tt>RUN</tt> (executes the program).

The string concatenation operator is the colon.

A BASIC source line can begin with a statement label. It always ends with a carriage return (Return). UniVerse BASIC uses the term newline to indicate the character or character sequence that defines where a line ends in a record in a type 1 or type 19 file. The newline differs according to the operating system you are using. [...] UniVerse BASIC handles this difference transparently in nearly every case...

Here are some lines from the guide. -- insert ref --


 * "In UniVerse BASIC source code, character string constants are a sequence of ASCII characters enclosed in single or double quotation marks, or backslashes ( \ ). "  (2-3)
 * "Dynamic arrays map the structure of UniVerse file records to character string data. Any character string can be a dynamic array. A dynamic array is a character string containing elements that are substrings separated by delimiters. At the highest level these elements are fields separated by field marks ( F ) (ASCII 254). Each field can contain values separated by value marks ( V ) (ASCII 253). Each value can contain subvalues separated by subvalue marks ( S ) (ASCII 252)." (2-10)
 * "Vectors are treated as matrices with a second dimension of 1. COST(35) and COST(35,1) are equivalent specifications and can be used interchangeably." (2-10)
 * "Conceptually, [a] dynamic array has an infinite number of fields, all of which are empty except... References made to the [empty fields], for example, return an empty string." (2-11) Essentially array's are initialized to empty strings, rather than NULL.

perform SQL DML (data manipulation language) statements."'' (4-19)
 * Have to explicitly write transaction's isolation levels, they're not inferred at all. (4-15), Then it goes on to say ''"In the SQL environment UniVerse automatically acquires the locks it needs to


 * "However, due to performance issues, for a large system we recommend that you use level 2 [Read Commited] rather than 3 [Repeatable Read] or 4 [Serializable] for most programs. UniVerse provides a default of 0 [No Isolation] for backward compatibility." 4-16


 * The BASIC debugger is named RAID (5-6), there is no reason why. I would presume some hardware RAID developer pissed off a UniVerse guy, and he wanted to make the term more ambiguous. --> Because RAID kills bugs dead!

RetrieVe
RetrieVe: The worst query language ever. It kind of eerily represents SQL. Finally a language with the built in formatting options to output a result set and configure my printer for mailing envelope labels (snide comment at <tt>LIST.LABEL</tt>).
 * Uses <tt>WITH</tt> instead of <tt>WHERE</tt>
 * <tt>LIST</tt>
 * <tt>@</tt> phrase defines what columns are returned with an unqualified <tt>LIST</tt> verb.
 * formatting can be modified with TOXML, ELEMENTS, WITHDTD, WITHSCHEMA. Optional <tt>TO</tt> clause specifies destination file.
 * Destination media in the query with special clause: . supports a different phrase for columns inclusion <tt>@LPTR</tt>
 * Throwaway keywords: English stopwords that assist in making queries readable, excludes other stop words that are vital to queries such as <tt>AND</tt> and <tt>OR</tt>.
 * SQL <tt>TRUNCATE</tt> has an analog called <tt>CLEAR.FILE</tt>.(1-96, pg 125 UniVerse User Reference)
 * SQL <tt>DROP TABLE</tt> has an analog called <tt>DELETE.FILE</tt>.(1-204, pg 233 UniVerse User Reference)
 * SQL <tt>DROP INDEX</tt> has an analog called <tt>DELETE.INDEX</tt>, for an added bonus you can use the keyword <tt>ALL</tt> to delete all of your indexes.(1-207, pg 236 UniVerse User Reference)
 * A cool addition that I don't see in many modern databases, is the ability to suspend all of your indexes, or control them in a useful fashion: UniVerse provides <tt>x.INDEX { x := ENABLE | DISABLE | UPDATE | DELETE }</tt>


 * @ID is an implicitly listed column with the <tt>LIST</tt> verb that must be explicitly disabled if desired by adding <tt>ID.SUP</tt> token to the end of the <tt>LIST</tt> query, or to the beginning of the <tt>@</tt> phrase definition.
 * The qualification of the primary key is implicit
 * can be found rather than
 * can be found rather than


 * RetrieVe about covers the whole gambit of verbs with the same base term of MATCH to different meanings.
 * Built in interactive prompting with <tt>inquiring</tt> token: <tt>LIST file inquiring</tt> prompts the user to enter the record <tt>@id</tt> to list.
 * <tt>UNLIKE</tt> and <tt>LIKE</tt> have similar analogs <tt>MATCHING</tt> and <tt>NOT.MATCHING</tt> with no advice on which one to use or why. This is further compounded by the fact that pattern matching is done through the like-named <tt>MATCH</tt> and its synonym <tt>MATCHES</tt>.


 * Pattern Matching in RetrieVe is limited to matching a single quantifier with a single type: alphabetic (<tt>A</tt>), numeric (<tt>N</tt>), character (<tt>X</tt>).

The implementation of <tt>SELECT LIST</tt> is probably the niftiest thing in RetrieVe and I'd suggest you read about there, there are lots of things too numerous to cover here like permanent storage with <tt>x.LIST (x := SAVE | GET | DELETE | EDIT | COPY | MERGE ), LIST.x ( x := DIFF, INTER, UNION )</tt>, and sublists.
 * The inequality operator (<tt>NE</tt>) has three analogs pound (<tt>#</tt>), and two angle-bracket combos: <tt>&gt;&lt;</tt> and <tt>&lt;&gt;</tt>.
 * <tt>LIKE</tt> or <tt>MATCHING</tt>:
 * uses a multicharacter ellipsis <tt>...</tt> to signify 0 or more of any character, rather than the traditional <tt>%</tt>
 * uses an awkward quantitative syntax for characters, <tt>n[XAN]</tt>: n characters (X), alphanumeric characters (A), numeric characters (N); prefix with <tt>~</tt> for negation  returns <tt>true</tt> if <tt>foo</tt> has no numbers.
 * Literal context is very awkward, rather than <tt>'01/01/2010'::date</tt> it magically knows to interpret <tt>01/01/2010</tt>, I would like to think it was inferred based on the query but I have no reason to believe that's how the voodoo works. Like behaviour can be found with the <tt>MONEY</tt> type,.
 * There is a <tt>BETWEEN</tt> like shorthand that permits  to be rewritten as   omitting the column name; furthermore if the statement is <tt>OR</tt>, you can omit it too:   (greater than 100 or less than 5).
 * An individual query has control over the use of the DB index, with <tt>NO.INDEX</tt> or <tt>REQUIRE.INDEX</tt> (you have to assume what the default behaviour is, and assume that the behaviour won't or shouldn't be adversely affected by you mucking with the index under a different load).
 * Sorting with <tt>SORT</tt> and <tt>BY</tt>:
 * There is a verb <tt>SORT</tt> that emulates an ordered LIST.
 * A <tt>LIST</tt> can be ordered by adding a <tt>BY column</tt> for ascending sort, or <tt>BY.DSND column</tt> for descending sort.
 * Sorting is dependent on the columns justification:
 * right-justified fields are sorted on the numeric and alphanumeric parts separately: <tt>12A</tt>, is grouped with other <tt>12</tt>s and appears wherever <tt>A</tt> is in the collation
 * In addition to the concept of SQL <tt>AS</tt>, a synonym which can be referenced in other parts of the query. RetrieVe also employs a <tt>COL.HDG name</tt> construct, which simply renames the output of the query column, but doesn't change how it is referenced in other parts of the query.
 * <tt>DISPLAY.LIKE</tt> causes an output column to copy the metadata regarding <tt>FMT</tt> and <tt>CONV</tt> from another column accessible from the query.
 * Report Qualifiers: these allow you to control everything about the report "output" from the query. The problem with this feature is the vagueness: outputted from the db, outputted from the client, the configuration of the printer and the terminal, the order id of the kitchen sink... Yea that is all outputish. There are upwards of twenty of these.
 * some of them have to do with configuring a printer: <tt>AUX.PORT</tt>, <tt>LPTR</tt>, <tt>MARGIN</tt>
 * some of them have to do with content of the query <tt>SAMPLE</tt> (output int from top, equiv to <tt>LIMIT int OFFSET 0</tt>), and SAMPLED (output every int'th row), these exist in modernity too
 * some of them totally change the display of the report's content: <tt>VERTICALLY</tt> (orientation), <tt>COL.SPACES</tt> (column spacing), <tt>DBL.SPACES</tt> (row spacing)
 * some of them are little random nuggets <tt>GRAND TOTAL</tt> (amount of "lines" outputted for your report), <tt>FOOTING</tt> (footers), <tt>HEADING</tt> (headers).
 * etc,
 * <tt>SELECT</tt> (LISTS) (not to be confused with SQL SELECT) might not be a bad idea if they were limited to the UI rather than being a feature of the query language. What a SELECT LIST does is to permit you to store a result set retrieved with a full query and then substitute the result set for a like named table:  is roughly the same as  . An unqualified SELECT LIST must be used by the next statement or it will be discarded, additionally you can qualify it by stating which preset 1-10 you wish to store it by suffixing with a <tt>TO</tt> clause like <tt>SELECT ... TO int</tt>, using a stored SELECT LIST is done with the <tt>FROM</tt> qualifier, like this: <tt>LIST col1 FROM int</tt>. The important thing to know is the SELECT list executes the query and saves the @ids, when you apply it selects on the rows referenced by the original @ids it received. You can save foreign keys in a SELECT LIST, and then apply them to a query on the referenced table. You can see if you have an active SELECT LIST by paying attention to the prompt (>> active select list) vs (> no select list)
 * The <tt>SEARCH</tt> verb essentially greps the table.
 * PRIME - find the prime number closest to its argument.

Good ideas: RetrieVe: The report qualifier <tt>SAMPLED int</tt>: returns every nth row.

UniVerse SQL

 * The sentence stack permits you to recall, edit, append, insert, delete sentences which are simply plain text SQL statements that have already been executed.
 * The macro <tt>@SELECT</tt> controls the expansion of the columns in an SQL statement like, except if you're using the file name, in which case it uses the naked <tt>@</tt> macro.
 * The qualification of the primary key can be implicit
 * can be found rather than
 * can be found rather than
 * Inequality operator of <tt>'#'</tt>, still has inequality of <tt>'&lt;&gt;'</tt>
 * The inline prompting mechanism permits you to prompt the user from the UI.
 * The <tt>INQUIRING</tt> keyword forces the user to be prompted to enter the primary key: <tt>SELECT * FROM table INQUIRING;</tt>
 * Query optimizer control in SQL statements with keywords like <tt>NO.OPTIMIZE</tt>


 * Use of <tt>WHEN</tt> clause to differentiate between when multivalued sub-rows should show when the containing row is already in the result set (selected with a <tt>WHERE</tt> or otherwise). This is also used on <tt>UPDATE</tt> if you wish to update only a set few of the rows in a multivalued column.

Good ideas:
 * The <tt>ESCAPE char</tt> clause permits you to specify what escapes the SQL meta-characters like <tt>'%'</tt>.
 * All SQL compliances, JOINS, GROUP BY, HAVING, WHERE,

Editorial Scratchpad
Some people, even self-pronounced UniVerse professionals with a speciality in it like this guy (one of the many) can be stupefied by the internals of UniVerse. He didn't know of the existence of <tt>DICT.DICT</tt> until May of 2009. This isn't really Billy's fault though, here is what the 899 page UniVerse doc has to say about <tt>DICT.DICT</tt>.

DICT.DICT is a system file dictionary that functions as the master file dictionary for RetrieVe processing of all the other UniVerse dictionaries. That's it, lacking isn't it? And, despite that it is the master file dictionary for RetreiVe there are 0 occurrences of <tt>DICT.DICT</tt> in the whole 273 page RetreiVe documentation. .

Installing on Ubuntu
Find my guide on installing and customizing U2 UniVerse on a self-answered question on serverfault.

libodbc.so problems
If you see this you'll need to execute.

From the install directory: || Link libodbc.so to /.uvlibs

Segmentation Fault
If you get a single line that says <tt>Segmentation fault</tt> chances are you feed uv.load to `cpio -ivcBdum`. I'm not sure why they docs say to do this: uv.load is a shell script.

Premature end of File
If you get the following:

You need to edit your uv.load and remove the <tt>-c</tt> argument to cpio by changing line 18.

to

CGI group
On my machine this is the effect of another premature end of file error resulting from.

Ok, for this one you need to remove the <tt>-B</tt> option from the <tt>uv.load</tt>