ocrd.processor.helpers module

Helper methods for running and documenting processors

ocrd.processor.helpers.generate_processor_help(ocrd_tool, processor_instance=None)

Generate a string describing the full CLI of this processor including params.

Parameters
  • ocrd_tool (dict) – this processor’s tools section of the module’s ocrd-tool.json

  • processor_instance (object, optional) – the processor implementation (for adding any module/class/function docstrings)
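As a rough illustration of the idea described above (turning a tool description into CLI help text), here is a minimal sketch. It is not the library's implementation: the function name sketch_processor_help, the output layout, and the example_tool dict are all hypothetical, and the keys it reads ("executable", "description", "parameters" with "type", "default", "description") are assumed to follow the usual ocrd-tool.json layout.

```python
def sketch_processor_help(ocrd_tool):
    """Render a short CLI summary from a tool description dict (illustrative only)."""
    lines = [
        "Usage: %s [OPTIONS]" % ocrd_tool.get("executable", "<executable>"),
        "",
        "  %s" % ocrd_tool.get("description", "(no description)"),
        "",
        "Parameters:",
    ]
    parameters = ocrd_tool.get("parameters", {})
    if not parameters:
        lines.append("  (none)")
    # One line per parameter: name, type, optional default, description.
    for name, spec in sorted(parameters.items()):
        default = spec.get("default")
        suffix = " [default: %r]" % (default,) if default is not None else ""
        lines.append("  %s (%s)%s: %s" % (
            name, spec.get("type", "?"), suffix, spec.get("description", "")))
    return "\n".join(lines)


# Hypothetical tool description, invented for this sketch.
example_tool = {
    "executable": "ocrd-example",
    "description": "Hypothetical processor used only for this sketch",
    "parameters": {
        "level": {
            "type": "string",
            "default": "page",
            "description": "Hierarchy level to operate on",
        },
    },
}

help_text = sketch_processor_help(example_tool)
print(help_text)
```

The real function can additionally draw on docstrings from processor_instance, which this sketch leaves out.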

ocrd.processor.helpers.run_cli(executable, mets_url=None, resolver=None, workspace=None, page_id=None, overwrite=None, log_level=None, input_file_grp=None, output_file_grp=None, parameter=None, working_dir=None)

Open a workspace and run a processor on the command line.

If workspace is not None, reuse it. Otherwise, instantiate a Workspace for mets_url (and working_dir) using ocrd.Resolver.workspace_from_url(), i.e. open or clone a local workspace.

Run the processor CLI executable on the workspace, passing:

  • the workspace

  • page_id

  • input_file_grp

  • output_file_grp

  • parameter (after applying any parameter_override settings)

(This creates output files and updates the METS in the filesystem.)

Parameters

executable (string) – Executable name of the module processor.
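To make the argument passing described above more concrete, the following sketch assembles a command line from the same keyword arguments. It only builds the list and never starts a process. The helper name sketch_cli_args is hypothetical, and the long option names (--mets, --working-dir, --page-id, --input-file-grp, --output-file-grp, --parameter, --log-level, --overwrite) are assumed from the common OCR-D processor CLI convention, not confirmed by this page.

```python
import shlex


def sketch_cli_args(executable, mets_url, working_dir=None, page_id=None,
                    overwrite=None, log_level=None, input_file_grp=None,
                    output_file_grp=None, parameter=None):
    """Assemble an argument list for a processor CLI (illustrative only)."""
    args = [executable, "--mets", mets_url]
    # Optional settings are appended only when the caller supplied them.
    optional = [
        ("--working-dir", working_dir),
        ("--log-level", log_level),
        ("--page-id", page_id),
        ("--input-file-grp", input_file_grp),
        ("--output-file-grp", output_file_grp),
        ("--parameter", parameter),
    ]
    for flag, value in optional:
        if value:
            args += [flag, value]
    # Boolean switch without a value.
    if overwrite:
        args.append("--overwrite")
    return args


# Hypothetical executable name, paths and file groups.
args = sketch_cli_args(
    "ocrd-example",
    "mets.xml",
    working_dir="/data/ws",
    page_id="PHYS_0001",
    input_file_grp="OCR-D-IMG",
    output_file_grp="OCR-D-OUT",
    parameter='{"level": "page"}',
)
print(shlex.join(args))
```

A list like this could be handed to subprocess.run(); keeping the arguments as a list rather than a single string avoids shell quoting problems with the JSON parameter value.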

ocrd.processor.helpers.run_processor(processorClass, ocrd_tool=None, mets_url=None, resolver=None, workspace=None, page_id=None, log_level=None, input_file_grp=None, output_file_grp=None, show_resource=None, list_resources=False, parameter=None, parameter_override=None, working_dir=None)

Instantiate a Pythonic processor, open a workspace, run the processor and save the workspace.

If workspace is not None, reuse it. Otherwise, instantiate a Workspace for mets_url (and working_dir) using ocrd.Resolver.workspace_from_url(), i.e. open or clone a local workspace.

Instantiate a Python object for processorClass, passing:

  • the workspace

  • ocrd_tool

  • page_id

  • input_file_grp

  • output_file_grp

  • parameter (after applying any parameter_override settings)

Run the processor on the workspace (creating output files in the filesystem).

Finally, write back the workspace (updating the METS in the filesystem).

Parameters

processorClass (object) – Python class of the module processor.
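The sequence described above (reuse or open a workspace, apply overrides, instantiate, run, save) can be sketched with stand-in classes so that it runs without any OCR-D installation. Everything here is hypothetical: FakeWorkspace, FakeResolver, EchoProcessor and sketch_run_processor are invented for illustration, and treating parameter_override as an iterable of (key, value) pairs is an assumption, not something this page states.

```python
class FakeWorkspace:
    """Stand-in for a workspace; records outputs and whether it was saved."""
    def __init__(self, mets_url, directory=None):
        self.mets_url = mets_url
        self.directory = directory
        self.outputs = []
        self.saved = False

    def save_mets(self):
        self.saved = True


class FakeResolver:
    """Stand-in resolver that opens a FakeWorkspace for a METS URL."""
    def workspace_from_url(self, mets_url, dst_dir=None):
        return FakeWorkspace(mets_url, dst_dir)


class EchoProcessor:
    """Stand-in processor that records what it was asked to do."""
    def __init__(self, workspace, ocrd_tool=None, page_id=None,
                 input_file_grp=None, output_file_grp=None, parameter=None):
        self.workspace = workspace
        self.output_file_grp = output_file_grp
        self.parameter = parameter

    def process(self):
        self.workspace.outputs.append((self.output_file_grp, self.parameter))


def sketch_run_processor(processor_class, mets_url=None, resolver=None,
                         workspace=None, parameter=None,
                         parameter_override=None, working_dir=None, **kwargs):
    # 1. Reuse the given workspace, or open one through the resolver.
    if workspace is None:
        resolver = resolver or FakeResolver()
        workspace = resolver.workspace_from_url(mets_url, working_dir)
    # 2. Apply overrides on top of the base parameters (assumed pair format).
    effective = dict(parameter or {})
    for key, value in (parameter_override or []):
        effective[key] = value
    # 3. Instantiate, run, and write back the workspace.
    processor = processor_class(workspace, parameter=effective, **kwargs)
    processor.process()
    workspace.save_mets()
    return processor


processor = sketch_run_processor(
    EchoProcessor,
    mets_url="mets.xml",
    working_dir="/data/ws",
    input_file_grp="OCR-D-IMG",
    output_file_grp="OCR-D-OUT",
    parameter={"level": "page"},
    parameter_override=[("level", "line")],
)
print(processor.workspace.saved, processor.parameter)
```

Passing an existing workspace skips the resolver entirely, which mirrors the "reuse" branch described above.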