Records
Records are a flexible way to compose information coming from various sources. For instance, a processing chain can first produce records containing only an ID. Later, you can retrieve the item content and add it to the record. Further along in the processing, you may want to add some transformation of the item content.
Records support this kind of incremental enrichment by holding a set of items. Record types form a lattice of types, so checking that some item types are present in a record is easy.
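The ID, then content, then transformation scenario above can be illustrated with a small self-contained toy. This is only a sketch of the idea, not datamaestro's implementation: `ToyRecord`, the stand-in `Item` class, its `base()` helper and the three item classes are all hypothetical names introduced for illustration. It assumes that a record keeps at most one item per item base class and that updating returns a new record.

```python
from dataclasses import dataclass


class Item:
    """Stand-in item base class (hypothetical, not datamaestro.record.Item)."""

    @classmethod
    def base(cls) -> type:
        # The "base class" of an item type: the ancestor deriving directly from Item
        for klass in cls.__mro__:
            if Item in klass.__bases__:
                return klass
        raise TypeError(f"{cls.__name__} does not derive from Item")


class ToyRecord:
    """Toy record holding at most one item per item base class."""

    def __init__(self, *items: Item):
        self.items = {type(item).base(): item for item in items}

    def get(self, key):
        # Look the item up by base class, then check it has the requested type
        item = self.items.get(key.base())
        return item if isinstance(item, key) else None

    def has(self, key) -> bool:
        return self.get(key) is not None

    def update(self, *items: Item) -> "ToyRecord":
        # Build a new record; an item with the same base class replaces the old one
        merged = {**self.items, **{type(i).base(): i for i in items}}
        return ToyRecord(*merged.values())


@dataclass
class IDItem(Item):
    id: str


@dataclass
class TextItem(Item):
    text: str


@dataclass
class TokensItem(Item):
    tokens: list


# 1. The chain first produces a record with only an ID
record = ToyRecord(IDItem("doc-1"))
# 2. Later, the content is retrieved and added
record = record.update(TextItem("hello world"))
# 3. Further on, a transformation of the content is added
record = record.update(TokensItem(record.get(TextItem).text.split()))
```

Each step only needs to know about the item types it reads or writes, which is what makes this style of composition convenient in a processing chain.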
Working with record types
Record types form a lattice of types that can be used to check record properties beforehand.
ABRecord = record_type(AItem, BItem)
AB1Record = record_type(AItem, B1Item)
# Hierarchy-based check
assert ABRecord.contains(AB1Record)
# Checks for specific types
assert ABRecord.has(AItem, BItem)
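One plausible reading of the two checks above can be sketched with plain `issubclass` tests. This is an assumption about the semantics, not a description of the library: `ToyRecordType` and the empty item classes below are hypothetical, and the sketch assumes that `B1Item` derives from `BItem` and that a type "contains" another when each of its item types is covered by an item type (or subclass) of the other.

```python
class Item:
    """Stand-in item base class (hypothetical)."""


class AItem(Item):
    pass


class BItem(Item):
    pass


class B1Item(BItem):
    # Assumption: B1Item specializes BItem
    pass


class ToyRecordType:
    """Toy record type: an unordered set of item types."""

    def __init__(self, *item_types: type):
        self.item_types = frozenset(item_types)

    def has(self, *item_types: type) -> bool:
        # Each requested type must be matched by one of our types or a subclass of it
        return all(
            any(issubclass(own, requested) for own in self.item_types)
            for requested in item_types
        )

    def contains(self, other: "ToyRecordType") -> bool:
        # The more general type contains the more specific one
        return other.has(*self.item_types)


ABRecord = ToyRecordType(AItem, BItem)
AB1Record = ToyRecordType(AItem, B1Item)

assert ABRecord.contains(AB1Record)
assert not AB1Record.contains(ABRecord)
assert ABRecord.has(AItem, BItem)
```

Under this reading the relation is a partial order: `ABRecord` sits above `AB1Record`, while the reverse check fails because `BItem` does not specialize `B1Item`.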
Validating
To ensure that a record satisfies the requested properties, one can build it through a record type:
ABRecord = record_type(AItem, BItem)
# OK
ABRecord(AItem(1), BItem(2))
# Fails: A1Item is not AItem
ABRecord(A1Item(1), BItem(2))
# Fails: AItem is not present
ABRecord(BItem(2))
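The three cases above are consistent with a check on exact item types. The following toy shows such a check; it is a sketch under that assumption, not the library's validation code. The helper `missing_item_types` and the dataclass definitions of `AItem`, `A1Item` and `BItem` are hypothetical, and whether extra items should also be rejected is left open here.

```python
from dataclasses import dataclass


class Item:
    """Stand-in item base class (hypothetical)."""


@dataclass
class AItem(Item):
    a: int


@dataclass
class A1Item(AItem):
    # Assumption: A1Item extends AItem with a second field
    a1: int


@dataclass
class BItem(Item):
    b: int


def missing_item_types(expected, items):
    """Return the expected item types with no item of exactly that type."""
    present = {type(item) for item in items}
    return [item_type for item_type in expected if item_type not in present]


expected = (AItem, BItem)

# OK: both item types are present
ok = missing_item_types(expected, [AItem(1), BItem(2)])
# Fails: the item is an A1Item, not exactly an AItem
wrong_type = missing_item_types(expected, [A1Item(1, 2), BItem(2)])
# Fails: no AItem at all
absent = missing_item_types(expected, [BItem(2)])
```

A non-empty result is what a validating constructor would turn into an error.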
When updating, it is also possible to validate the result against a target record type:
A1BRecord = record_type(A1Item, BItem)
record = Record(AItem(1), BItem(2))
# Update the ABRecord into an A1/B one
record.update(A1Item(1, 2), target=A1BRecord)
API
- class datamaestro.record.Item
Base class for all item types
- class datamaestro.record.RecordType(*item_types: Type[T])
- __call__(*items: T)
Call self as a function.
- sub(*item_types: Type[T])
Returns a new record type based on self and new item types
- class datamaestro.record.Record(*items: Dict[Type[T], T] | T, override=False)
Associate types with entries
A record is a composition of items; each item base class is unique.
- get(key: Type[T]) → T | None
Get a given item or None if it does not exist
- has(key: Type[T]) → bool
Returns True if the record has the given item type
- update(*items: T, target: RecordType | None = None) → Record
Update some items
- datamaestro.record.record_type(*item_types: Type[T])
Returns a new record type