Commit 1f09b4ab by twl8n

adding proposed web app flow docs

parent ad1eab68
...@@ -159,14 +159,14 @@ database returning match strength. Mode two is somewhat unclear. ...@@ -159,14 +159,14 @@ database returning match strength. Mode two is somewhat unclear.
- workflow manager (Tom) - workflow manager (Tom)
Every action the application can perform is part of the work flow. All these actions along with their Every action the application can perform is part of the work flow. The names of these actions along with names
requisites are organized into a work flow table. The work flow engine does not know how to do real work, but of their requisites are organized into a work flow table. The work flow engine does not know how to do real
it does know the names of the functions which do the real work. When a new task is added to the application, work, but it does know the names of the functions which do the real work. A new feature (aka function, task)
that task must be added to the work flow, and a function created in the application to perform the is added to the application, by adding its name to the work flow, and creating a function of the same name in
task. Likewise, requistes are determined by boolean functions, and every requisite must have a matching the application. Likewise, requistes are determined by boolean functions, and every requisite must have a
function known to the work flow engine. By including requisites, the work flow can enforce role-based matching function known to the work flow engine. The work flow enforces role-based behavior by testing the
behavior. The workflow engine exists, but needs to be ported from Perl to PHP, and the work flow data may need requisites. The workflow engine exists, but needs to be ported from Perl to PHP, and the work flow data should
to be stored in the SQL database. be stored in the SQL database.
- Support for work history and task staging. - Support for work history and task staging.
...@@ -238,6 +238,78 @@ easily add new export features to support emerging standards. ...@@ -238,6 +238,78 @@ easily add new export features to support emerging standards.
The ability to export all the data for preservation purposes also gives us the ability to offer bulk data The ability to export all the data for preservation purposes also gives us the ability to offer bulk data
downloads to researchers and collaborating peer institutions. downloads to researchers and collaborating peer institutions.
#### Web application overview
Some aspects of the web app aren't yet clear, so there are details to be worked out, and some large-ish
concepts to clarify. I'm guessing we will agree on most things, and one of us or the other will just concede
on stuff where we don't agree.
Requirements:
- expose an http accessible API that is viable for wget, browser <form>, and Ajax calls.
- Supported input format depends on the complexity of the requested operation.
- Public functions require no authentication. Everything else must include authentication data.
Internal flow:
1. validate the inputs.
1. Somehow slice and dice the CGI params of the REST call into an abstracted request we can pass to the
internal API. I suppose that the external and internal APIs are very similar, but we almost certainly need
some level of symbolic reference aka abstraction. Each REST call has its requisite data. Some data is as
simple as a record id, and some will be fairly interesting json data structures.
1. The web app API does the tasks specified by the REST request and the work flow engine's directions.
a. Every http request must go through the work flow engine so that the work flow is validated and managed. Every
web app has a work flow, but people mostly just cobble that together with a bunch of implied functionality
using conditionals and side-effect-full function calls. In our code, the internal API is 100% work flow
agnostic. I can explain this in more detail, but it makes a huge improvement in the structure of the
application.
1. Create the output data object if it wasn't created by the functions doing the work.
The work flow engine needs a number of functions that read application data and return booleans so that the
work flow engine can detect the application's relevant state. I guess that sounds confusing because the work
flow engine has state, and the application has state. Those two types of state are vastly different and only
related to each other in that the work flow engine can detect the application's state. The internal API of the
web app has no idea that the work flow engine even exists. And the work flow engine knows what work needs to
be done, but has no idea how it will be done.This is a very lovely separation of concerns.
My strong preference is to use an existing template module for output. I have attached a CPF xml template
based on the Template Tookit http://www.template-toolkit.org/ for which there is a Perl module, and apparently
a Python module. Template modules are fairly common, so I'm almost certain we will have several to choose from
in PHP.
I don't know how you feel about web frameworks, but I don't like them. Too hard to work with and very little
functionality I need, and often they break MVC, and they generally made debugging nearly impossible. Much
better to use a select few modules to create a lightweight quasi-framework that is perfectly matched to our
needs.
The internal API does its work, and creates the data for the output. Output data is passed to a rendering
layer that relies on the template module. The only code that knows anything about rendering is the rendering
layer. To all the non-rendering code, there is only "output data" which does conform to a standard structure
(almost certainly an output data object). The rendering layer uses the output object, and the requested format
of the output (text, html, pdf, xml, etc.) to create the output. Happily, "rendering" is generally a single
function call. Create a template object, call its "render" method with two arguments: (1) template file name,
(2) the output data object. Default behavior is to write the output to stdout, but the render method can also
return the output in a variable so we can create an http download.
Templates are human created static files with placeholders. The template engine fills in the placeholders with
values from relevant parts of the output data. Clearly, the output data object and the template must share a
object/property naming convention. The template engine functionality has single value fields, looping over
input lists, and if statement branching based on input. But that's pretty much it. No work is done in the
template that is not directly concerned with filling in placeholders, not even formatting (in the sense of
rounding numbers, capitalizing strings, or adding html tags). Templates are valid documents of the output
type, except in rare cases. The attached template is well-formed XML.
The web app needs a file download output option as well as output to stdout.
#### Data background #### Data background
The data is in a SQL database. Every piece of data is in a separate field to the extent that is practical. The data is in a SQL database. Every piece of data is in a separate field to the extent that is practical.
...@@ -333,83 +405,68 @@ group-permission management. ...@@ -333,83 +405,68 @@ group-permission management.
We have created a generalized workflow system (as opposed to an ad-hoc linked set of reports). There is a work We have created a generalized workflow system (as opposed to an ad-hoc linked set of reports). There is a work
flow state table which needs to be moved into the database. flow state table which needs to be moved into the database.
Need fields to deal with delete/embargo. This may be best implemented Need fields to deal with delete/embargo. This may be best implemented via a trigger or perhaps a view. By
via a trigger or perhaps a view. By making what appear to be simple making what appear to be simple SELECTs through a view, the view can exclude deleted records. We must think
SELECTs through a view, the view can exclude deleted records. We must about how using a view (or trigger) will effect UPDATE and INSERT. Ideally the view is transparent. Is there
think about how using a view (or trigger) will effect UPDATE and INSERT. some clever way we can restrict access to the original table only via the view?
Ideally the view is transparent. Is there some clever way we can
restrict access to the original table only via the view? Need record lock on some types of records. This lock needs to be honored by several modules, so like “delete”,
lock might best be implemented via a view and we \*only\* access the table in question via the view.
Need record lock on some types of records. This lock needs to be honored
by several modules, so like “delete”, lock might best be implemented via If there are different levels of review for different elements in the record, then we need extra granularity
a view and we \*only\* access the table in question via the view. in the workflow or the edited record info to know the type of record edited apropos of workflow variations.
If there are different levels of review for different elements in the If there different reviewers for different parts of the record, then workflow data (and workflow
record, then we need extra granularity in the workflow or the edited configuration) needs to be able to notify multiple people, and would have to get multiple reviewer approvals
record info to know the type of record edited apropos of workflow
variations.
If there different reviewers for different parts of the record, then
workflow data (and workflow configuration) needs to be able to notify
multiple people, and would have to get multiple reviewer approvals
before moving to the next phase of the workflow. before moving to the next phase of the workflow.
Institutional affiliation is probably common enough to want a field in Institutional affiliation is probably common enough to want a field in the user table, as opposed to creating
the user table, as opposed to creating a group for each institution. The a group for each institution. The group is perhaps more generalized and could behave identical (or almost
group is perhaps more generalized and could behave identical (or almost
identical) to a field (with controlled vocabulary) in the user table. identical) to a field (with controlled vocabulary) in the user table.
Make sure we can write a query (report) to count numbers of records Make sure we can write a query (report) to count numbers of records based type of edit, institution of the
based type of edit, institution of the editor, and number of holdings. editor, and number of holdings.
If we want to be able to quickly count some CPF element such as outgoing If we want to be able to quickly count some CPF element such as outgoing links from CPF to a given
links from CPF to a given institution, then we should put those CPF institution, then we should put those CPF values into the SQL database, as meta data for the CPF record.
values into the SQL database, as meta data for the CPF record.
What is: How many referral links to EAC records that they created? What is: How many referral links to EAC records that they created?
Be able to count record views, record downloads. Institutional dashboard Be able to count record views, record downloads. Institutional dashboard reports need the ability to group-by
reports need the ability to group-by user, or even filter to a specific user, or even filter to a specific user.
user.
Reporting needs to help managers verify performance metrics. This assumes that all changes have a
Reporting needs to help managers verify performance metrics. This date/timestamp. Once workflow and process decisions are set, performance requirements for users such as
assumes that all changes have a date/timestamp. Once workflow and load/performance (how many updates and changes to records can be handled at once), search response time, edit
process decisions are set, performance requirements for users such as time (outside of review workflow), and update times need to be set.
load/performance (how many updates and changes to records can be handled
at once), search response time, edit time (outside of review workflow), Effort reporting to allow SNAC and participants to communicate to others the actual level of effort
and update times need to be set. involved. This sounds like a report with time span and numbers of records handled in various ways. SNAC might
use this when going from pilot into production so that everyone knows what effort will be required for X
Effort reporting to allow SNAC and participants to communicate to others number of records/actions (of whatever action type).
the actual level of effort involved. This sounds like a report with time
span and numbers of records handled in various ways. SNAC might use this Time/activity reporting could allow us to assess viability, utility, and efficiency of maintenance system
when going from pilot into production so that everyone knows what effort processes.
will be required for X number of records/actions (of whatever action
type). Similar reports might be generated to evaluate the discovery interface. Something akin to how much time was
required to access a certain number of records. Rachael said: Assess viability of access funtionality-
Time/activity reporting could allow us to assess viability, utility, and
efficiency of maintenance system processes.
Similar reports might be generated to evaluate the discovery interface.
Something akin to how much time was required to access a certain number
of records. Rachael said: Assess viability of access funtionality-
performance time, available features, and ease of use. performance time, available features, and ease of use.
We could try to report on the amount of training necessary before a new We could try to report on the amount of training necessary before a new user was able to work independently in
user was able to work independently in each of various areas (content each of various areas (content input, review, etc.)
input, review, etc.)
#### Merge and watch #### Merge and watch
Note: Ask Robbie what the database architecture is to support merged records. Note: Ask Robbie what the database architecture is to support merged records.
Users may "watch" an identity. If a file is being watched, and that file is part of an description (merged or single) then the watch will Users may "watch" an identity. If a file is being watched, and that file is part of an description (merged or
apply to the results of human edits, regardless of which part of the description was modified. It is possible single) then the watch will apply to the results of human edits, regardless of which part of the description
for someone to wish to track a biogHist, but that biogHist could be completely removed in lieu of an improved was modified. It is possible for someone to wish to track a biogHist, but that biogHist could be completely
and updated description. We do not track individual elements in CPF. We only track an entire description, removed in lieu of an improved and updated description. We do not track individual elements in CPF. We only
regardless the watcher's motivation. The original motivation for watching might no longer exist after an edit, track an entire description, regardless the watcher's motivation. The original motivation for watching might
and if so, the watcher can simply disable their watch. After each edit, all watchers will get a no longer exist after an edit, and if so, the watcher can simply disable their watch. After each edit, all
notification. The watch does not apply to any single field, but to the entire description, and therefore also watchers will get a notification. The watch does not apply to any single field, but to the entire description,
to future descriptions which result from merging. and therefore also to future descriptions which result from merging.
What happens to a watch on a merged description which is subsequently split? Does the watch apply to both What happens to a watch on a merged description which is subsequently split? Does the watch apply to both
split descriptions or to neither description? Perhaps is it best to disable the watch, and inform the watcher split descriptions or to neither description? Perhaps is it best to disable the watch, and inform the watcher
......
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment