*[Controlled vocabularies and tag system](#controlled-vocabularies-and-tag-system)
### Big questions
- (solved) how is gitlab backed up?
...
...
@@ -9,7 +22,7 @@
- We need a complete understanding of our requirements for controlled vocabulary, ontology, and/or tagging as
well as how this relates to search facets. This also impacts our future ability to make assertions about the
data, and is somewhat related to semantic net. See [Tag system](#controlled-vocabularies-and-tag-system).
### Documents we need to create
- Operations and Procedure Manual
...
...
@@ -23,9 +36,9 @@
- User Story Backlog
- Design Documents (UI/UX/Graphic Design)
- ideally someone writes a (possibly brief) style guide
- a set of .psd or other images is not a style guide
### Governance and Policies, etc.
...
...
@@ -57,7 +70,7 @@
### Non-component notes to be worked into requirements
### Non-component notes to be worked into requirements
- CPF record edit, edit each field
- CPF record split, split data into separate cpf identities, deprecate old ARK, mint new ARKs
...
...
@@ -71,11 +84,11 @@
- Data validation engine
-**API:** Custom JSON (needs formal spec)
- The data validation engine applies a written system of rules to the incoming data. The rules must be written in a human-readble form, such that non-technical individuals are able to write and understand the rules. A rule-writing guide must be supplied to give hints and help for writing rules properly. The engine will be pluggable and written as an MVC application. The model will read the user-written rules (stored in a postgres database or flat-file system, depending on the model) and apply them to any input given on the view. The initial view will be a JSON API, which accepts individual fields and returns the input with either validation suggestions or valid flags.
- The data validation engine applies a written system of rules to the incoming data. The rules must be written in a human-readble form, such that non-technical individuals are able to write and understand the rules. A rule-writing guide must be supplied to give hints and help for writing rules properly. The engine will be pluggable and written as an MVC application. The model will read the user-written rules (stored in a postgres database or flat-file system, depending on the model) and apply them to any input given on the view. The initial view will be a JSON API, which accepts individual fields and returns the input with either validation suggestions or valid flags.
- rule based system abstracted out of the code
- rules are data
- change the rules, not the actual application code
- change the rules, not the actual application code
- rules for broad classes of data type, or granular rules for individual fields
- probably used this to untaint data as well (remove things that are potential security problems)
***We need to stop now, write requirements, then apply those requirements moving forward to ensure we meet the requirements**
- should we re-architect tables become normal tables, the views go away, and versioned records are moved to shadow tables.
- add features for delete-via-mark (as opposed to actual delete)
- add features to support embargo
-*maybe, discuss* change vocabulary.sql from insert to copy. It would be smaller and faster, although in reality as soon as
it is in the database, the text file will never be touched again.
- discuss; Can/should we create a tag system to deal with ad-hoc requirements later in the project? [Tag system](#controlled-vocabularies-and-tag-system)
- CPF to SQL parser (Robbie)
...
...
@@ -121,7 +134,7 @@
- exists, needs tests, needs requirements
***We need to stop now, write requirements, then apply those requirements moving forward to ensure we meet the requirements**
- Can we find a grammar-based parser for PHP? Should we use a standalone parser?
- Can we expose this as a JSON API such that it's given a name-string and returns an identity object of that identity's information? Possibly a score as well as to how well we thought we could parse it?
- Name parser (only a portion of the NameEntity string)
- Name parser (only a portion of the NameEntity string)
-**API:** subroutine? JSON?
- Can this use the same parser engine as the name string parser?
...
...
@@ -165,7 +178,7 @@
- authentication
- session management, especially as applies to authentication tokens, cookies and something which prevents
XSS (cross-site scripting) attacks
- JavaScript UI component tools, JQuery; what others?
- Suggestions: bootstrap, angular JS, JQueryUI
...
...
@@ -177,7 +190,7 @@
#### Controlled vocabularies and tag system
#### Controlled vocabularies and tag system
Tags are simply terms. When inplemented as fixed terms with persistent IDs and some modify/add policy, tags
become a flat (non-hierarchal) controlled vocabulary.
...
...
@@ -187,8 +200,8 @@ would consist of tag term and an ID value. Tag systems lack data type, and gener
restrictive policies about creating new tags
Below are some subject examples. It is unclear if these are each topics, or if "--" is used to join granular
topics into a topic list. Likewise it is unclear if this list relies on some explicit hierarchy.
topics into a topic list. Likewise it is unclear if this list relies on some explicit hierarchy.
```
American literature--19th century--Periodicals
American literature--20th century--Periodicals
...
...
@@ -201,4 +214,3 @@ World politics--Study and teaching
```
**RH: I agree, this is super tricky. We need to distinguish between types of controlled vocab, so that we don't mix Occupation and Subject. A tagging system might be very nice for at least subjects.**