Closed
Open
Opened Aug 18, 2015 by Tom Laudeman (@twl8n)
clean various characters from data

These are the "start of string" and "string terminator" control characters. Their purpose here is unclear, but they are almost certainly not part of name strings.

Aachen, Johann \230von\234 1552-1615

\230 = #x98, \234 = #x9c

von
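
A minimal sketch of stripping these two characters, assuming the data has already been decoded to Unicode so that they appear as U+0098 and U+009C. The function name and the decision to delete the characters outright (rather than replace them with something) are assumptions for illustration, not what clean.pl is known to do.

```python
import re

# Assumption: "start of string" (U+0098) and "string terminator" (U+009C)
# are never wanted in name strings, so both are simply deleted.
C1_STRING_DELIMITERS = re.compile("[\u0098\u009c]")


def strip_string_delimiters(text: str) -> str:
    """Remove U+0098 and U+009C, leaving all other characters untouched."""
    return C1_STRING_DELIMITERS.sub("", text)


# The example name from this issue, with the control characters made explicit.
raw = "Aachen, Johann \u0098von\u009c 1552-1615"
cleaned = strip_string_delimiters(raw)
print(cleaned)
```

Deleting only the two delimiters keeps the surrounding spaces intact, so the enclosed "von" stays in place in the name.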


  • Clean this character, which has an ampersand glyph but is not the ASCII ampersand: ＆ (hex ff06, FULLWIDTH AMPERSAND)

http://shannon.village.virginia.edu:8088/xtf/data/ead_taro/taro/sjmh/00064.r004.xml

/data/source/findingAids/taro/sjmh/00064.xml
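
One possible way to handle the hex ff06 character is sketched below. It assumes "clean" means mapping the character to a plain ASCII ampersand; the table and function names are hypothetical. If the cleaned text is written back into XML, the resulting "&" would still need to be escaped by whatever serializer is used.

```python
# Assumption: U+FF06 should become an ordinary ASCII ampersand (U+0026).
# Only this one code point is mapped; other fullwidth forms are left alone.
FULLWIDTH_TO_ASCII = {0xFF06: "&"}


def normalize_ampersand(text: str) -> str:
    """Replace FULLWIDTH AMPERSAND (U+FF06) with '&'."""
    return text.translate(FULLWIDTH_TO_ASCII)


sample = "Smith \uff06 Sons"  # made-up string, not taken from the finding aid
fixed = normalize_ampersand(sample)
print(fixed)
```

A targeted translation table is used here instead of a blanket compatibility normalization so that no other characters in the data are altered as a side effect.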


The dreaded "Left-to-Right Embedding" (U+202A). The text below should not even be rendered left-to-right.

I think there's a regex in clean.pl to deal with this. See: git@github.com:snac-cooperative/snac_ead_to_cpf.git

Juárez-Lincoln Center (Austin, Tex.)
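
As a rough illustration of the kind of regex mentioned above (not a copy of what clean.pl does), the sketch below removes the Left-to-Right Embedding character. It also removes U+202C (Pop Directional Formatting), on the assumption that an embedding is usually closed by one; where exactly the characters sit in the real data is also an assumption.

```python
import re

# U+202A = LEFT-TO-RIGHT EMBEDDING, U+202C = POP DIRECTIONAL FORMATTING.
# Assumption: neither carries meaning in these name strings.
BIDI_EMBEDDING = re.compile("[\u202a\u202c]")


def strip_bidi_embedding(text: str) -> str:
    """Delete the embedding controls while keeping accented letters intact."""
    return BIDI_EMBEDDING.sub("", text)


# Hypothetical placement of the invisible characters around the name.
raw_name = "\u202aJuárez-Lincoln Center (Austin, Tex.)\u202c"
clean_name = strip_bidi_embedding(raw_name)
print(clean_name)
```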

Milestone
October 16 Sprint (3 weeks)
Reference: snac/Documentation#3