(select fk_id,max(version) as version from language where fk_id=$1 and version<=$2 group by fk_id) as bb
where
not is_deleted and aa.fk_id=bb.fk_id and aa.version=bb.version
```
This is fairly similar to a version-aware select from a normal table. We constrain on fk_id and group by fk_id,
whereas the normal query constrains on main_id and groups by id.
Because we're using php+sql, we have the related table's id in a variable, so we don't do a two table join,
although we're effectively doing just that via 2 simple queries instead of one more complex query (or one
larger stored procedure).
We save the related table name in language.fk_table. This is redundant, but might be useful for reporting. The
language.fk_id will be related to many tables, and typically we first query the related table, and second
query language. However, there may be a circumstance (which I can hardly imagine) where we only have access to
the language table, but we still want to know all the records relating to nrd. We could do this:
```
select * from language where fk_table='nrd';
```
That query presupposes that table nrd is not avaiable for a join, which would be extremely odd.
The fk_table field might be useful if we exported records to a database which lacks sequences, for example
MySQL, SQLite, or Oracle. Although so many other problems would arise that having fk_table is hardly useful. Besides, we can always populate fk_table with a trivial SQL query:
```
update language set fk_table='nrd' where fk_id in (select id from nrd);
```
Still, inserting values into fk_table seems like a fine, if redundant idea.
### Vocabulary table history and rationale
All controlled vocabulary terms are in a single table. A vocabularies share a common set of fields:
value(term), uri, description. Storing all vocabulary in a single tables makes it easy to add a new
vocabulary. At this time, vocabulary is mono-lingual, that is, term and description exist in a single
language, and might be duplicated in another language.
```
create table if not exists vocabulary (
id int primary key default nextval('vocabulary_id_seq'),
type text, -- Type of the vocab
value text, -- Value of the controlled vocab term
uri text, -- URI for this controlled vocab term, if it exists
description text -- Textual description of this vocab term
);
```
A proposed alternative was to store each vocabulary type in a separate table, but that has down side. Adding a
new type would require creating a new table (with identical field names to existing vocabulary tables) as well
a new insert and select functions, and this is more work than simply inserting new vocabulary data with a new
type. When tables are identical except for table name, table name is data. In this case table name would be
exactly equivalent to vocabulary type.
SQL databases give us an additional hint that vocabulary should be in a single table. We can prepare SQL
queries, and use placeholders. However, table name is not avaiable as a placeholder. The twenty vocabulary
types differ only by "type", and "type" is data best stored as a field, and not as name of twenty tables.
Each "vocabulary" is distinguished by vocaulary.type:
```
type
--------------------
record_type
script_code
entity_type
event_type
name_type
occupation
language_code
gender
nationality
maintenance_status
agent_type
document_role
document_type
function_type
function
subject
date_type
relation_type
place_match
source_type
```
### Vocabulary separate SQL file
The vocabulary related table and sequence SQL schema is in separate file, install/sql_files/vocabulary_init.sql.
This was done in order to facilitate rebuilding the development copies of the database. Vocabulary is fairly
large, and takes between 20 seconds and a minute to load. Developers may wipe their copy of the database
several times per day, but if they aren't working on any vocabulary code, it saves time to leave the vocabulary
tables intact.
See the install script: install/install.php
### Common fields and their meaning
Fields id, main_id, version, and is_deleted are common to all first order data tables.
table.id is the record id, and (table.id,table.version) forms a unique key.
table.main_id=version_history.main_id, and table.main_id is the constellation id for all tables except
nrd. The constellation id of nrd is nrd.id and nrd.main_id, although it is best to go with nrd.id.
table.main_id is a constellation id, that is a grouping column. Across all tables, a single main_id value
identifies all the records belonging to a single constellation. As a constellation is edited, there will be
multiple version numbers, but main_id must remain constant.
table.version=version_history.id and you will notice that in all cases where we select version_history.id we
alias it as version. In SQL.php see: sqlCurrentVersion(), sqlMultiNameConstellationID(), selectDemoRecs().
table.is_deleted is true for deleted records
Remember that with our versioning system this database is insert-only. There are no updates. There will
eventually be a case where some non-constellation data table has an update, but as of Feb 12 2016, SQL.php
does not have a single update.
Delete is accomplished by getting a new version number, and using it along with setting is_deleted to true to
insert the "deleted" record. All the normal queries ignore is_deleted records, and only special admin queries
will ever be able to see deleted records.
### Identical text tables
We have four tables (and their related classes) that contain identically structured
textual data:
```
convention_declaration
structure_genealogy
general_context
mandate
```
As noted above, when the only difference between tables is table name, those tables should be combined. We may
eventually combine all the tables holding data from AbstractTextData class into a single table, but that has
not been done yet.
Each table looks like mandate, differing only by table name:
```
create table mandate (
id int default nextval('id_seq'),
version int not null,
main_id int not null, -- fk to version_history.main_id
is_deleted boolean default false,
text text -- the text term
);
```
Insert to these is handled by insertTextCore(), and selecte via selectTextCore(). The only difference in the
arguments is the table name, that is the "term type".
When the time comes to add a new text item, it will be necessary to add new wrapper insert and select
functions, as well as adding a new table. It is probably better at that time to join the tables together, and
refactor the code to support the single table paradigm as with vocabulary.
### Geographic authority
We are currently working on geographic a authority, based primarily on geonames. The authority table is
geo_place. Table place_link hold relations between original data tables and geo_place.
Currently ony table nrd (the Constellation) has place links.
In theory, place via <place>, <places>, or <placeEntry> can occur in elements: description, chronItem,