ocrd_validators.workspace_validator module

Validating a workspace.

class ocrd_validators.workspace_validator.WorkspaceValidator(resolver, mets_url, src_dir=None, skip=None, download=False, page_strictness='strict', page_coordinate_consistency='poly')

Bases: object

Validates an OCR-D/METS workspace against the specs.

Construct a new WorkspaceValidator.

Parameters
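  • resolver (Resolver) – Resolver

  • mets_url (string) – URL of the METS file

  • src_dir (string) – Directory containing the METS file

  • skip (list) – Tests to skip

  • download (boolean) – Whether to download files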

  • page_strictness ("strict"|"lax"|"fix"|"off") –

  • page_coordinate_consistency ("poly"|"baseline"|"both"|"off") –

static check_file_grp(workspace, input_file_grp=None, output_file_grp=None, page_id=None, report=None)

Return a report on whether each file group in input_file_grp exists in workspace.mets and each file group in output_file_grp does not. Intended to be run before processing.

Parameters
  • workspace (Workspace) –

  • input_file_grp (list|string) –

  • output_file_grp (list|string) –

  • page_id (list|string) –
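
A minimal sketch of a pre-flight check built on check_file_grp, assuming the ocrd package is installed and that the returned report exposes is_valid and errors. The helper as_grp_list is hypothetical and only illustrates that both a list and a string are accepted; treating a string as comma-separated is an assumption, not something stated above.

```python
def as_grp_list(value):
    """Hypothetical helper: normalize a file group argument to a list.

    Assumption: a string holds one or more comma-separated group names.
    """
    if value is None:
        return []
    if isinstance(value, str):
        return [name.strip() for name in value.split(",") if name.strip()]
    return list(value)


def preflight(workspace, input_file_grp, output_file_grp, page_id=None):
    """Run check_file_grp and raise if the workspace is not ready."""
    # Imported lazily so the helper above works without ocrd installed.
    from ocrd_validators import WorkspaceValidator

    report = WorkspaceValidator.check_file_grp(
        workspace,
        input_file_grp=as_grp_list(input_file_grp),
        output_file_grp=as_grp_list(output_file_grp),
        page_id=page_id,
    )
    # Assumption: the report exposes is_valid and errors.
    if not report.is_valid:
        raise ValueError("file group check failed: %s" % report.errors)
    return report
```

Raising on an invalid report is one possible design; a caller could equally log the errors and continue.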

static validate(*args, **kwargs)

Validates the workspace of a METS URL against the specs.

Parameters
  • resolver (ocrd.Resolver) – Resolver

  • mets_url (string) – URL of the METS file

  • src_dir (string, None) – Directory containing the METS file

  • skip (list) – Tests to skip. One or more of 'mets_unique_identifier', 'mets_file_group_names', 'mets_files', 'pixel_density', 'dimension', 'url'

  • download (boolean) – Whether to download files

Returns

report (ValidationReport) – Report on the validity
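
A minimal sketch of calling validate with the parameters documented above, assuming the ocrd package is installed and that Resolver can be constructed without arguments. The helper unknown_skips and the wrapper validate_workspace are hypothetical names introduced for illustration; the skip names are taken from the parameter list above, and an installed version of the library may accept others.

```python
# Skip names as listed in the parameter description above; an installed
# version of the library may accept more.
DOCUMENTED_SKIPS = (
    "mets_unique_identifier",
    "mets_file_group_names",
    "mets_files",
    "pixel_density",
    "dimension",
    "url",
)


def unknown_skips(skip):
    """Hypothetical helper: return skip names not in the documented list."""
    return sorted(set(skip or []) - set(DOCUMENTED_SKIPS))


def validate_workspace(mets_url, src_dir=None, skip=None, download=False):
    """Validate a workspace and return the resulting report."""
    # Imported lazily so the helper above works without ocrd installed.
    from ocrd import Resolver
    from ocrd_validators import WorkspaceValidator

    # validate() takes the same arguments as the WorkspaceValidator constructor.
    return WorkspaceValidator.validate(
        Resolver(),
        mets_url,
        src_dir=src_dir,
        skip=skip or [],
        download=download,
    )
```

Calling validate_workspace("mets.xml", skip=["pixel_density"]) would then return the report for a workspace in the current directory; "mets.xml" is a placeholder path. Checking unknown_skips first is a cheap way to catch a misspelled test name before running the validation.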