We will add fields to table related_resource, and (perhaps?) create new table related_resource_name. We will have to go back to the originally extracted SNAC CPF files because we need additional fields from both MODS and EAD.
We will add fields to table related_resource, and create new table related_resource_creator (was
related_resource_name). We will have to go back to the originally extracted SNAC CPF files because we need 2
additional fields from MODS (datafield 040$a, 300), and 1 field from EAD (prefercite).
In the case of MODS we need 2 fields.
From WorldCat MARC we need:
1) the original WorldCat "institution identifier" which is either an OCLC Symbol or a Marc Organization
Identifier. A software pipeline will process this data into additional fields in table related_resource, as
well as new constellations for each repository. Ideally, the new constellations will have an address and other
geographic information as part of their place element.
1) The original WorldCat 040$a "institution identifier" which is either an OCLC Symbol or a Marc Organization
Identifier. We already have this data, and a software pipeline is processing this data into additional fields
in table related_resource, as well as new constellations for each repository. Ideally, the new constellations
will have an address and other geographic information as part of their place element.
2) The 300 extent data which is human readable information about the size/extent of the archival materials
2) the 300 extent data which is human readable information about the size/extent of the archival materials
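The two MARC fields listed above could be pulled out along these lines. This is only a sketch: it assumes the original WorldCat records are available as single MARCXML `record` elements in the standard `http://www.loc.gov/MARC21/slim` namespace, and the function names, the returned dictionary keys, and the sample values (`XYZ`, `3 boxes`) are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: records are MARCXML in the standard MARC21 "slim" namespace.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}


def subfield_values(record, tag, code=None):
    """Return non-empty subfield texts for every datafield with this tag.

    If code is None, all subfields of the datafield are returned in order.
    """
    values = []
    for field in record.findall(f"marc:datafield[@tag='{tag}']", MARC_NS):
        for sub in field.findall("marc:subfield", MARC_NS):
            if code is None or sub.get("code") == code:
                text = (sub.text or "").strip()
                if text:
                    values.append(text)
    return values


def extract_repo_and_extent(marcxml):
    """Extract the 040$a institution identifier and the 300 extent text."""
    record = ET.fromstring(marcxml)
    org_codes = subfield_values(record, "040", "a")
    extent_parts = subfield_values(record, "300")
    return {
        # Hypothetical key names; the real column names may differ.
        "institution_identifier": org_codes[0] if org_codes else None,
        "extent": " ".join(extent_parts) or None,
    }


# Made-up sample record for illustration only.
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="040" ind1=" " ind2=" ">
    <subfield code="a">XYZ</subfield>
  </datafield>
  <datafield tag="300" ind1=" " ind2=" ">
    <subfield code="a">3 boxes</subfield>
    <subfield code="b">(1.5 linear ft.)</subfield>
  </datafield>
</record>"""

result = extract_repo_and_extent(SAMPLE)
```

The 300 subfields are joined with a space because the extent is meant to be read by humans, not parsed further.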
In the case of EAD we need the \<prefercite> element where there is no \<repository> element. We will review
the data since the two elements are not supposed to be interchangeable. Element prefercite contains the
...
@@ -25,21 +28,42 @@ institution name and often the address as well, often in a single line. There is
repository is missing or empty. When there is a good repository, prefercite usually seems to be left out or
empty.
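The fallback described here (use repository when it has content, otherwise prefercite) might look roughly like the following. It is a sketch under assumptions: the function names and the sample EAD fragments are hypothetical, and the element names are matched by local name so that namespaced and non-namespaced EAD are both handled.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def element_text(root, name):
    """Whitespace-normalized text of the first non-empty element called name."""
    for el in root.iter():
        if isinstance(el.tag, str) and local_name(el.tag) == name:
            text = " ".join("".join(el.itertext()).split())
            if text:
                return text
    return ""


def repository_source(ead_xml):
    """Return (element_used, text), preferring repository over prefercite."""
    root = ET.fromstring(ead_xml)
    repo = element_text(root, "repository")
    if repo:
        return ("repository", repo)
    cite = element_text(root, "prefercite")
    if cite:
        return ("prefercite", cite)
    return (None, "")


# Made-up fragments for illustration only.
EAD_WITH_REPO = (
    "<ead><archdesc><did><repository><corpname>Example Library</corpname>"
    "</repository></did></archdesc></ead>"
)
EAD_NO_REPO = (
    "<ead><archdesc><did><repository/></did>"
    "<prefercite><p>Example Library, 1 Main St.</p></prefercite>"
    "</archdesc></ead>"
)
```

Returning the name of the element that was used keeps the two sources distinguishable, which matters because the data still needs review before prefercite is trusted as a repository name.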
Question: Why are we using resource name instead of doing a join to the related constellation via ic_id? Does
related_resource_name exist for performance reasons?
Repository name/info is saved in a constellation, and any resources that need it will link to it via ic_id as
a foreign key relation. A repository's role is always "repository" or
```
http://id.loc.gov/vocabulary/relators/rps
```
todo: are there other role types? Answer: There sort of can't be, but we need to do something with them.
Need to scan through the data for non-repo, non-orig.
Note: repository is superseded by looking up the OCLC/MARC org code or the EAD repository.
Resource language is a many-to-one. Use a reverse foreign key from the multiple records in the language
table, back to the related_resource.id.
related_resource_creator is creator aka origination aka originationName as from ResourceRelationstoPostgres.txt
Example: A constellation has resourceRelation to a field book (archival object) written by Clausen, Jens (Jens
Christian), 1891-1969. The related resource creator is Clausen, Jens (Jens Christian), 1891-1969.
Resource language is a many-to-one relation to related_resource. We use a reverse foreign key from each related
record in the language table, back to the related_resource.id. This means we have to parse the EAD
langmaterial/language elements.
We do not have language for MARC derived records. While it is possible to get some language info from the MARC
546, the values are discursive as opposed to language codes, or even a textual language name.
```
language.fk_id=related_resource.id and language.fk_table='related_resource'
```
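As a rough illustration of parsing the EAD langmaterial/language elements into rows that carry the reverse foreign key, something like the following could work. It assumes non-namespaced EAD, and the function name, the `langcode` and `name` keys, and the sample fragment are hypothetical; only `fk_id` and `fk_table` come from the text above.

```python
import xml.etree.ElementTree as ET


def language_rows(ead_xml, related_resource_id):
    """Build one language row per <language> inside <langmaterial>.

    Each row points back at related_resource via fk_id / fk_table.
    """
    root = ET.fromstring(ead_xml)
    rows = []
    for langmaterial in root.iter("langmaterial"):
        for lang in langmaterial.iter("language"):
            rows.append({
                "fk_id": related_resource_id,
                "fk_table": "related_resource",
                # Hypothetical column names for the code and display name.
                "langcode": lang.get("langcode"),
                "name": (lang.text or "").strip(),
            })
    return rows


# Made-up fragment for illustration only.
SAMPLE_EAD = (
    "<ead><archdesc><did><langmaterial>Materials in "
    '<language langcode="eng">English</language> and '
    '<language langcode="ger">German</language>.'
    "</langmaterial></did></archdesc></ead>"
)

rows = language_rows(SAMPLE_EAD, 42)
```

A langmaterial element with no language children yields no rows, which matches the MARC-derived case where no language is recorded.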
Resource name is a many-to-one. Use a reverse foreign key from related_resource_name.fk_id to related_resource.id.
Resource creator name is a many-to-one. Use a reverse foreign key from related_resource_creator.fk_id to related_resource.id.
```
related_resource_name.fk_id=related_resource.id and related_resource_name.fk_table='related_resource'
related_resource_creator.fk_id=related_resource.id and related_resource_creator.fk_table='related_resource'
```
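The reverse foreign key lookup for creators can be tried out with a small in-memory database. This sketch uses SQLite purely as a stand-in for the real database, and the `title` and `name` columns and the `creators_for` helper are assumptions; only `fk_id`, `fk_table`, and the two table names are taken from the text.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE related_resource (
    id INTEGER PRIMARY KEY,
    title TEXT                 -- hypothetical column
);
CREATE TABLE related_resource_creator (
    id INTEGER PRIMARY KEY,
    fk_id INTEGER,             -- points back at related_resource.id
    fk_table TEXT,             -- always 'related_resource' for these rows
    name TEXT                  -- hypothetical column
);
""")
conn.execute("INSERT INTO related_resource (id, title) VALUES (1, 'Field book')")
conn.execute(
    "INSERT INTO related_resource_creator (fk_id, fk_table, name) VALUES (?, ?, ?)",
    (1, "related_resource", "Clausen, Jens (Jens Christian), 1891-1969"),
)


def creators_for(conn, resource_id):
    """Return creator names for one related_resource row, in insertion order."""
    rows = conn.execute(
        """
        SELECT c.name
        FROM related_resource AS r
        JOIN related_resource_creator AS c
          ON c.fk_id = r.id AND c.fk_table = 'related_resource'
        WHERE r.id = ?
        ORDER BY c.id
        """,
        (resource_id,),
    ).fetchall()
    return [name for (name,) in rows]
```

Because the link lives on the creator side, a resource with several creators simply has several creator rows with the same `fk_id`.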
The place and address associated with a repository is handled via a place in the constellation. Address data
...
@@ -63,12 +87,14 @@ Several new fields are added to the related_resource table:
As noted above, we did not capture the MARC 300 extent, so I will have to parse the original WorldCat records for that data.
The related_resource_name table is new.
The related_resource_creator table is new.
```
```
--
-- The role aka roleTerm of repo_ic_id is always http://id.loc.gov/vocabulary/relators/rps