ocrd_models.ocrd_mets module
API to METS
-
class ocrd_models.ocrd_mets.OcrdMets(**kwargs)
Bases: ocrd_models.ocrd_xml_base.OcrdXmlDocument
API to a single METS file
-
property unique_identifier
Get the unique identifier by looking through mods:identifier. See specs for details.
-
add_agent(*args, **kwargs)
Add an ocrd_models.ocrd_agent.OcrdAgent to the list of agents in the metsHdr.
-
find_all_files(*args, **kwargs)
Like find_files(), but return a list of all results. Equivalent to list(self.find_files(...))
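The relationship between the two methods can be illustrated with a toy stand-in. This is a minimal sketch, not the real class: ToyMets and its string "files" are hypothetical, and only the generator-versus-list behaviour described above is modelled.

```python
class ToyMets:
    """Hypothetical stand-in for illustration only, not the real OcrdMets."""

    def __init__(self, file_ids):
        self._file_ids = list(file_ids)

    def find_files(self, ID=None):
        # Generator: results are produced lazily, one at a time.
        for file_id in self._file_ids:
            if ID is None or file_id == ID:
                yield file_id

    def find_all_files(self, *args, **kwargs):
        # Same query, but collected into a list up front.
        return list(self.find_files(*args, **kwargs))


toy = ToyMets(["FILE_0001", "FILE_0002"])
lazy = toy.find_files()       # generator object, nothing evaluated yet
eager = toy.find_all_files()  # plain list of all matches
```

The list form is convenient when the results are needed more than once or their count matters; the generator form avoids building the full list.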
-
find_files(ID=None, fileGrp=None, pageId=None, mimetype=None, url=None, local_only=False)
Search mets:file entries in this METS document and yield results.
The ID, fileGrp, url and mimetype parameters can each be either a literal string, or a regular expression if the string starts with // (double slash). If it is a regex, the leading // is removed and candidates are matched against the regex with re.fullmatch. If it is a literal string, comparison is done with string equality.
The pageId parameter supports the numeric range operator .. (two dots). For example, to find all files in pages PHYS_0001 to PHYS_0003, PHYS_0001..PHYS_0003 will be expanded to PHYS_0001,PHYS_0002,PHYS_0003.
- Keyword Arguments
ID (string) – @ID of the mets:file
fileGrp (string) – @USE of the mets:fileGrp to list files of
pageId (string) – @ID of the corresponding physical mets:structMap entry (physical page)
url (string) – @xlink:href (URL or path) of mets:FLocat of mets:file
mimetype (string) – @MIMETYPE of mets:file
local_only (boolean) – Whether to restrict results to local files in the filesystem
- Yields
ocrd_models.ocrd_file.OcrdFile instances
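The matching rules described for find_files can be sketched in plain Python. This is only an illustration built from the description, not the library's own code: the helper names matches and expand_page_range are hypothetical, and the sketch assumes that both ends of a range share the same prefix and end in a zero-padded number.

```python
import re


def matches(pattern, candidate):
    """Apply the literal-or-regex rule to a single attribute value."""
    if pattern.startswith("//"):
        # Regex mode: drop the leading '//' and require a full match.
        return re.fullmatch(pattern[2:], candidate) is not None
    # Literal mode: plain string equality.
    return pattern == candidate


def expand_page_range(page_id):
    """Expand 'PHYS_0001..PHYS_0003' into the individual page IDs.

    Assumption (not from the source): the prefix of the start value is
    reused for every generated ID, and the number keeps its zero padding.
    """
    if ".." not in page_id:
        return [page_id]
    start, end = page_id.split("..", 1)
    start_match = re.fullmatch(r"(.*?)(\d+)", start)
    end_match = re.fullmatch(r"(.*?)(\d+)", end)
    if start_match is None or end_match is None:
        raise ValueError(f"range bounds must end in a number: {page_id!r}")
    prefix, first = start_match.groups()
    last = end_match.group(2)
    width = len(first)
    return [f"{prefix}{n:0{width}d}" for n in range(int(first), int(last) + 1)]
```

Because the regex branch uses re.fullmatch, a pattern such as //OCR-D matches only the exact value OCR-D; a trailing .* is needed to match OCR-D-IMG. Under these rules, a call such as find_files(fileGrp='//OCR-D-.*', pageId='PHYS_0001..PHYS_0003') would presumably select files from every group whose @USE starts with OCR-D- on the three listed pages.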
-
add_file_group(fileGrp)
Add a new mets:fileGrp.
- Parameters
fileGrp (string) – @USE of the new mets:fileGrp.
-
remove_file_group(USE, recursive=False, force=False)
Remove a mets:fileGrp (a single group with a fixed @USE, or several groups whose @USE matches a regex).
- Parameters
USE (string) – @USE of the mets:fileGrp to delete. Can be a regex if prefixed with //
recursive (boolean) – Whether to recursively delete each mets:file in the group
force (boolean) – Do not raise an exception if mets:fileGrp does not exist
-
add_file(fileGrp, mimetype=None, url=None, ID=None, pageId=None, force=False, local_filename=None, ignore=False, **kwargs)
Instantiate and add a new ocrd_models.ocrd_file.OcrdFile.
- Parameters
fileGrp (string) – @USE of mets:fileGrp to add to
- Keyword Arguments
mimetype (string) – @MIMETYPE of the mets:file to use
url (string) – @xlink:href (URL or path) of the mets:file to use
ID (string) – @ID of the mets:file to use
pageId (string) – @ID in the physical mets:structMap to link to
force (boolean) – Whether to add the file even if a mets:file with the same @ID already exists.
ignore (boolean) – Do not look for existing files at all. This shifts responsibility for preventing duplicate-ID errors to the user.
local_filename (string) –
-
remove_file(*args, **kwargs)
Delete each mets:file matching the query. Same arguments as find_files()
-
remove_one_file(ID)
Delete an existing ocrd_models.ocrd_file.OcrdFile.
- Parameters
ID (string) – @ID of the mets:file to delete
- Returns
The old ocrd_models.ocrd_file.OcrdFile reference.
-
property physical_pages
List all page IDs (the @ID of each physical mets:structMap mets:div)
-
get_physical_pages(for_fileIds=None)
List all page IDs (the @ID of each physical mets:structMap mets:div), optionally restricted to the pages of the mets:file @ID values given in for_fileIds.
-
set_physical_page_for_file(pageId, ocrd_file, order=None, orderlabel=None)
Set the physical page ID (@ID of the physical mets:structMap mets:div entry) corresponding to the mets:file ocrd_file, creating all structures if necessary.
- Parameters
pageId (string) – @ID of the physical mets:structMap entry to use
ocrd_file (object) – existing ocrd_models.ocrd_file.OcrdFile object
- Keyword Arguments
order (string) – @ORDER to use
orderlabel (string) – @ORDERLABEL to use
-
get_physical_page_for_file(ocrd_file)
Get the physical page ID (@ID of the physical mets:structMap mets:div entry) corresponding to the mets:file ocrd_file.
-
merge(other_mets, fileGrp_mapping=None, after_add_cb=None, **kwargs)
Add all files from other_mets.
Accepts the same kwargs as find_files()
- Keyword Arguments
fileGrp_mapping (dict) – Map other_mets fileGrp to fileGrp in this METS
after_add_cb (function) – Callback invoked after a file is added to the METS
-
property