ocrd.resolver module

class ocrd.resolver.Resolver[source]

Bases: object

Handle uploads, downloads, and repository access, and manage temporary directories.

download_to_directory(directory, url, basename=None, if_exists='skip', subdir=None)[source]

Download a file to a directory.

Early shortcut: if url is a local file that is already in directory, keep it there.

If basename is not given but subdir is, assume the user knows what she’s doing and use the last URL segment as the basename.

If neither basename nor subdir is given, use the alphanumeric characters of the URL as the basename.

Parameters
  • directory (string) – Directory to download files to

  • url (string) – URL to download from

  • basename (string, None) – Basename part of the filename on disk.

  • if_exists (string, "skip") – What to do if the target file already exists. One of skip (default), overwrite, or raise.

  • subdir (string, None) – Subdirectory to create within the directory. Think mets:fileGrp.

Returns

Local filename string, relative to directory
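The naming and if_exists rules described above can be sketched in plain Python. This is only an illustration of the documented behaviour, not the library's implementation: the helper names pick_basename and resolve_existing are hypothetical, and the exception type raised for if_exists='raise' is an assumption.

```python
from pathlib import Path
from urllib.parse import urlparse


def pick_basename(url, basename=None, subdir=None):
    """Hypothetical helper mirroring the documented basename rules."""
    if basename is not None:
        # An explicit basename always wins.
        return basename
    if subdir is not None:
        # Only subdir given: use the last segment of the URL path.
        return urlparse(url).path.rstrip("/").rsplit("/", 1)[-1]
    # Neither given: keep only the alphanumeric characters of the URL.
    return "".join(ch for ch in url if ch.isalnum())


def resolve_existing(target, if_exists="skip"):
    """Return True if the download should proceed, False if it is skipped.

    Assumption: 'raise' is modelled here with FileExistsError.
    """
    if if_exists not in ("skip", "overwrite", "raise"):
        raise ValueError("if_exists must be 'skip', 'overwrite' or 'raise'")
    if not Path(target).exists():
        return True
    if if_exists == "raise":
        raise FileExistsError(str(target))
    return if_exists == "overwrite"
```

For instance, pick_basename("https://example.org/data/page_0001.tif", subdir="OCR-D-IMG") yields the last URL segment, while the same call without subdir collapses the whole URL into its alphanumeric characters.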

workspace_from_url(mets_url, dst_dir=None, clobber_mets=False, mets_basename=None, download=False, src_baseurl=None)[source]

Create a workspace from a METS by URL (i.e. clone if mets_url is remote or dst_dir is given).

Parameters

mets_url (string) – Source METS URL or filesystem path

Keyword Arguments
  • dst_dir (string, None) – Target directory for the workspace. By default, a temporary directory is created under ocrd.constants.TMP_PREFIX. (The resulting path can be retrieved via ocrd.Workspace.directory.)

  • clobber_mets (boolean, False) – Whether to overwrite an existing mets.xml. By default, an existing mets.xml raises an exception.

  • download (boolean, False) – Whether to also download all the files referenced by the METS

  • src_baseurl (string, None) – Base URL for resolving relative file locations

Download (clone) mets_url to mets.xml in dst_dir, unless mets_url is already local and dst_dir is either None or identical to the directory that contains mets_url.

Returns

a new Workspace
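The condition under which the METS is cloned can be written out as a small predicate. The sketch below only restates the rule from the description; needs_clone and is_local are hypothetical names, POSIX-style paths are assumed, and the library's own checks may differ.

```python
import os
from urllib.parse import urlparse


def is_local(mets_url):
    """Assumption: a URL without a scheme, or with file://, counts as local."""
    return urlparse(mets_url).scheme in ("", "file")


def needs_clone(mets_url, dst_dir=None):
    """Return True if the METS would have to be downloaded (cloned)."""
    if not is_local(mets_url):
        # Remote METS: always clone.
        return True
    if dst_dir is None:
        # Local METS and no target directory: use it in place.
        return False
    # Local METS with a target directory: clone only if the directories differ.
    local_path = urlparse(mets_url).path if mets_url.startswith("file://") else mets_url
    src_dir = os.path.dirname(os.path.abspath(local_path))
    return os.path.abspath(dst_dir) != src_dir
```

Under these assumptions, a local /data/ws/mets.xml with dst_dir="/data/ws" is used in place, whereas the same file with dst_dir="/data/other" is cloned.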

workspace_from_nothing(directory, mets_basename='mets.xml', clobber_mets=False)[source]

Create an empty workspace.

Parameters

directory (string) – Target directory for the workspace. If None, a temporary directory is created under ocrd.constants.TMP_PREFIX. (The resulting path can be retrieved via ocrd.Workspace.directory.)

Keyword Arguments

clobber_mets (boolean, False) – Whether to overwrite an existing mets.xml. By default, an existing mets.xml raises an exception.

Returns

a new Workspace
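The directory fallback and the clobber_mets check can be illustrated with the standard library alone. This is a rough sketch under stated assumptions, not the library's code: prepare_empty_workspace is a hypothetical name, the TMP_PREFIX value is a placeholder for ocrd.constants.TMP_PREFIX, the exception type is assumed, and an empty file stands in for a real METS document.

```python
import os
import tempfile

# Placeholder only; the real prefix is defined in ocrd.constants.TMP_PREFIX.
TMP_PREFIX = "ocrd-sketch-"


def prepare_empty_workspace(directory=None, mets_basename="mets.xml", clobber_mets=False):
    """Hypothetical helper: pick the directory and guard an existing METS file."""
    if directory is None:
        # No directory given: fall back to a fresh temporary directory.
        directory = tempfile.mkdtemp(prefix=TMP_PREFIX)
    os.makedirs(directory, exist_ok=True)
    mets_path = os.path.join(directory, mets_basename)
    if os.path.exists(mets_path) and not clobber_mets:
        # Assumption: the refusal to overwrite is modelled with FileExistsError.
        raise FileExistsError(mets_path)
    # Stand-in for writing an empty METS document.
    with open(mets_path, "w", encoding="utf-8") as handle:
        handle.write("")
    return directory, mets_path
```

Calling the helper twice on the same directory raises on the second call unless clobber_mets=True is passed, which matches the behaviour described for the clobber_mets argument.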