efro.dataclassio package¶
Submodules¶
efro.dataclassio.extras module¶
Extra rarely-needed functionality related to dataclasses.
- class efro.dataclassio.extras.DataclassDiff(obj1: Any, obj2: Any)[source]¶
Bases: object
Wraps dataclass_diff() in an object for efficiency.
It is preferable to pass this to logging calls instead of the final diff string, since the diff will never be generated if the associated log message is not actually emitted.
Module contents¶
Functionality for importing, exporting, and validating dataclasses.
This allows complex nested dataclasses to be flattened to json-compatible data and restored from said data. It also gracefully handles and preserves unrecognized attribute data, allowing older clients to interact with newer data formats in a nondestructive manner.
- class efro.dataclassio.Codec(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)¶
Bases: Enum
Specifies the expected data format to export to or import from.
- FIRESTORE = 'firestore'¶
- JSON = 'json'¶
- class efro.dataclassio.DataclassFieldLookup(cls: type[T])¶
Bases: Generic[T]
Get info about nested dataclass fields in a type-safe way.
- path(callback: Callable[[T], Any]) str [source]¶
Look up a path on child dataclass fields.
Example
DataclassFieldLookup(MyType).path(lambda obj: obj.foo.bar)
The above example will return the string ‘foo.bar’ or something like ‘f.b’ if the dataclasses have custom storage names set. It will also be static-type-checked, triggering an error if MyType.foo.bar is not a valid path. Note, however, that the callback technically allows any return value but only nested dataclasses and their fields will succeed.
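The mechanism behind this can be sketched with a small attribute-recording proxy. This is not the real implementation (which also resolves custom storage names and validates that the path refers to actual dataclass fields); it only shows how a lambda's attribute accesses become a dotted string:

```python
# Sketch: a proxy object records each attribute access made by the
# callback, and the recorded names are joined into a dotted path.

class _PathProxy:
    def __init__(self, parts=()):
        self._parts = parts

    def __getattr__(self, name):
        # Each attribute access returns a new proxy with the name
        # appended, so chained accesses accumulate the full path.
        return _PathProxy(self._parts + (name,))

def lookup_path(callback) -> str:
    return '.'.join(callback(_PathProxy())._parts)

result = lookup_path(lambda obj: obj.foo.bar)
print(result)  # foo.bar
```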
- class efro.dataclassio.IOAttrs(storagename: str | None = None, *, store_default: bool = True, whole_days: bool = False, whole_hours: bool = False, whole_minutes: bool = False, soft_default: Any = <efro.dataclassio._base.IOAttrs._MissingType object>, soft_default_factory: Callable[[], Any] | _MissingType = <efro.dataclassio._base.IOAttrs._MissingType object>, enum_fallback: Enum | None = None)¶
Bases: object
For specifying io behavior in annotations.
- ‘storagename’, if passed, is the name used when storing to json/etc.
- ‘store_default’ can be set to False to avoid writing values when equal to the default value. Note that this requires the dataclass field to define a default or default_factory or for its IOAttrs to define a soft_default value.
- ‘whole_days’, if True, requires datetime values to lie exactly on day boundaries (see efro.util.utc_today()).
- ‘whole_hours’, if True, requires datetime values to lie exactly on hour boundaries (see efro.util.utc_this_hour()).
- ‘whole_minutes’, if True, requires datetime values to lie exactly on minute boundaries (see efro.util.utc_this_minute()).
- ‘soft_default’, if passed, injects a default value into dataclass instantiation when the field is not present in the input data. This allows dataclasses to add new non-optional fields while gracefully ‘upgrading’ old data. Note that when a soft_default is present it will take precedence over field defaults when determining whether to store a value for a field with store_default=False (since the soft_default value is what we’ll get when reading that same data back in when the field is omitted).
- ‘soft_default_factory’ is similar to ‘default_factory’ in dataclass fields; it should be used instead of ‘soft_default’ for mutable types such as lists to prevent a single default object from unintentionally changing over time.
- ‘enum_fallback’, if provided, specifies an enum value to be substituted in the case of unrecognized enum values.
- MISSING = <efro.dataclassio._base.IOAttrs._MissingType object>¶
- enum_fallback: Enum | None = None¶
- soft_default: Any = <efro.dataclassio._base.IOAttrs._MissingType object>¶
- soft_default_factory: Callable[[], Any] | _MissingType = <efro.dataclassio._base.IOAttrs._MissingType object>¶
- storagename: str | None = None¶
- store_default: bool = True¶
- validate_datetime(value: datetime, fieldpath: str) None [source]¶
Ensure a datetime value meets our value requirements.
- validate_for_field(cls: type, field: Field) None [source]¶
Ensure the IOAttrs instance is ok to use with the provided field.
- whole_days: bool = False¶
- whole_hours: bool = False¶
- whole_minutes: bool = False¶
- class efro.dataclassio.IOExtendedData¶
Bases: object
A class that data types can inherit from for extra functionality.
- did_input() None [source]¶
Called on a class instance after it has been created from data.
Can be useful to correct values from the db, etc. in type-safe form.
- classmethod handle_input_error(exc: Exception) Self | None [source]¶
Called when an error occurs during input decoding.
This allows a type to optionally return substitute data to be used in place of the failed decode. If it returns None, the original exception is re-raised.
It is generally a bad idea to apply catch-alls such as this, as it can lead to silent data loss. This should only be used in specific cases such as user settings where an occasional reset is harmless and is preferable to keeping all contained enums and other values backward compatible indefinitely.
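How a decoder might invoke these two hooks can be sketched as follows. Settings and load_settings are hypothetical illustrations (the real library's decoder is dataclass_from_dict and performs full type validation); only the hook-calling shape is shown:

```python
from dataclasses import dataclass

@dataclass
class Settings:
    volume: float = 1.0

    def did_input(self) -> None:
        # Correct values from stored data in type-safe form:
        # clamp anything out of range.
        self.volume = min(max(self.volume, 0.0), 1.0)

    @classmethod
    def handle_input_error(cls, exc: Exception):
        # Substitute defaults instead of failing the load. Harmless
        # for user settings; dangerous as a general catch-all.
        return cls()

def load_settings(values: dict) -> Settings:
    """Hypothetical decoder showing where the hooks fire."""
    try:
        obj = Settings(**values)
    except Exception as exc:
        obj = Settings.handle_input_error(exc)
        if obj is None:
            raise  # None means: re-raise the original error.
        return obj
    obj.did_input()  # fires only after a successful decode
    return obj

print(load_settings({'volume': 3.0}).volume)  # clamped to 1.0
print(load_settings({'bogus': 1}).volume)     # fell back to defaults
```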
- class efro.dataclassio.IOMultiType¶
Bases: Generic[EnumT]
A base class for types that can map to multiple dataclass types.
This enables usage of high level base classes (for example a ‘Message’ type) in annotations, with dataclassio automatically serializing & deserializing dataclass subclasses based on their type (‘MessagePing’, ‘MessageChat’, etc.)
Standard usage involves creating a class which inherits from this one which acts as a ‘registry’, and then creating dataclass classes inheriting from that registry class. Dataclassio will then do the right thing when that registry class is used in type annotations.
See tests/test_efro/test_dataclassio.py for examples.
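The registry idea can be sketched like this. MessageType, Message, and the subclasses are hypothetical names (the real base class is Generic over the type-id enum and the library drives the lookup itself during serialization); only the enum-to-subclass mapping is shown:

```python
from dataclasses import dataclass
from enum import Enum

class MessageType(Enum):
    PING = 'ping'
    CHAT = 'chat'

class Message:
    """Registry base class: maps type-id enum values to subclasses."""

    @classmethod
    def get_type(cls, type_id: MessageType) -> type['Message']:
        return {
            MessageType.PING: MessagePing,
            MessageType.CHAT: MessageChat,
        }[type_id]

@dataclass
class MessagePing(Message):
    pass

@dataclass
class MessageChat(Message):
    text: str = ''

# A decoder seeing a stored type-id of 'chat' would look up the enum
# value and instantiate the mapped subclass:
msg_cls = Message.get_type(MessageType('chat'))
print(msg_cls.__name__)  # MessageChat
```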
- classmethod get_type(type_id: EnumT) type[Self] [source]¶
Return a specific subclass given a type-id.
- classmethod get_type_id_storage_name() str [source]¶
Return the key used to store type id in serialized data.
The default is an obscure value so that it does not conflict with members of individual type attrs, but in some cases one might prefer to serialize it to something simpler like ‘type’ by overriding this call. One just needs to make sure that no encompassed types serialize anything to ‘type’ themselves.
- class efro.dataclassio.JsonStyle(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)¶
Bases: Enum
Different style types for json.
- FAST = 'fast'¶
- PRETTY = 'pretty'¶
- SORTED = 'sorted'¶
- efro.dataclassio.dataclass_from_dict(cls: type[T], values: dict, *, codec: Codec = Codec.JSON, coerce_to_float: bool = True, allow_unknown_attrs: bool = True, discard_unknown_attrs: bool = False) T ¶
Given a dict, return a dataclass of a given type.
The dict must be formatted to match the specified codec (generally json-friendly object types). This means that sequence values such as tuples or sets should be passed as lists, enums should be passed as their associated values, nested dataclasses should be passed as dicts, etc.
All values are checked to ensure their types/values are valid.
Data for attributes of type Any will be checked to ensure they match types supported directly by json. This does not include types such as tuples which are implicitly translated by Python’s json module (as this would break the ability to do a lossless round-trip with data).
If coerce_to_float is True, int values passed for float typed fields will be converted to float values. Otherwise, a TypeError is raised.
If allow_unknown_attrs is False, AttributeErrors will be raised for attributes present in the dict but not on the data class. Otherwise, they will be preserved as part of the instance and included if it is exported back to a dict, unless discard_unknown_attrs is True, in which case they will simply be discarded.
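The documented input conversions can be illustrated with a toy converter. This is a sketch only, not the real implementation (which also validates types, honors IOAttrs, handles unknown attrs, and covers many more type forms); it shows enums arriving as their values, tuples as lists, and nested dataclasses as dicts:

```python
import typing
from dataclasses import dataclass, fields, is_dataclass
from enum import Enum

def from_dict(cls, values):
    """Toy sketch of the documented conversions."""
    kwargs = {}
    for f in fields(cls):
        if f.name not in values:
            continue
        val = values[f.name]
        ftype = f.type  # assumes annotations are real types, not strings
        if is_dataclass(ftype):
            val = from_dict(ftype, val)          # nested dataclass: dict
        elif typing.get_origin(ftype) is tuple:
            val = tuple(val)                     # tuple: passed as list
        elif isinstance(ftype, type) and issubclass(ftype, Enum):
            val = ftype(val)                     # enum: passed as value
        kwargs[f.name] = val
    return cls(**kwargs)

class Color(Enum):
    RED = 'red'

@dataclass
class Point:
    x: float = 0.0
    y: float = 0.0

@dataclass
class Shape:
    color: Color = Color.RED
    center: Point = None  # default None kept simple for the sketch
    verts: tuple[int, ...] = ()

shape = from_dict(Shape, {'color': 'red',
                          'center': {'x': 1.0, 'y': 2.0},
                          'verts': [3, 4]})
print(shape)
```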
- efro.dataclassio.dataclass_from_json(cls: type[T], json_str: str, coerce_to_float: bool = True, allow_unknown_attrs: bool = True, discard_unknown_attrs: bool = False) T ¶
Return a dataclass instance given a json string.
Basically dataclass_from_dict(json.loads(…))
- efro.dataclassio.dataclass_hash(obj: Any, coerce_to_float: bool = True) str ¶
Calculate a hash for the provided dataclass.
Basically this emits json for the dataclass (with keys sorted to keep things deterministic) and hashes the resulting string.
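The sorted-keys-then-hash approach can be sketched with the stdlib. Note the digest algorithm is an assumption here (the docs above do not specify which hash is used); the point is that sorted keys make the json, and therefore the hash, deterministic:

```python
import hashlib
import json
from dataclasses import asdict, dataclass

def dc_hash(obj) -> str:
    # Sorted keys keep the json deterministic regardless of dict
    # ordering; sha256 is an assumption of this sketch.
    payload = json.dumps(asdict(obj), sort_keys=True,
                         separators=(',', ':'))
    return hashlib.sha256(payload.encode('utf-8')).hexdigest()

@dataclass
class Player:
    name: str = ''
    score: int = 0

a = dc_hash(Player('zoe', 3))
b = dc_hash(Player('zoe', 3))
c = dc_hash(Player('zoe', 4))
print(a == b, a == c)  # True False
```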
- efro.dataclassio.dataclass_to_dict(obj: Any, codec: Codec = Codec.JSON, coerce_to_float: bool = True, discard_extra_attrs: bool = False) dict ¶
Given a dataclass object, return a json-friendly dict.
All values will be checked to ensure they match the types specified on fields. Note that a limited set of types and data configurations is supported.
Values with type Any will be checked to ensure they match types supported directly by json. This does not include types such as tuples which are implicitly translated by Python’s json module (as this would break the ability to do a lossless round-trip with data).
If coerce_to_float is True, integer values present on float typed fields will be converted to float in the dict output. If False, a TypeError will be triggered.
- efro.dataclassio.dataclass_to_json(obj: Any, coerce_to_float: bool = True, pretty: bool = False, sort_keys: bool | None = None) str ¶
Utility function; return a json string from a dataclass instance.
Basically json.dumps(dataclass_to_dict(…)). By default, keys are sorted for pretty output and not otherwise, but this can be overridden by supplying a value for the ‘sort_keys’ arg.
- efro.dataclassio.dataclass_validate(obj: Any, coerce_to_float: bool = True, codec: Codec = Codec.JSON, discard_extra_attrs: bool = False) None ¶
Ensure that values in a dataclass instance are the correct types.
- efro.dataclassio.ioprep(cls: type, globalns: dict | None = None) None ¶
Prep a dataclass type for use with this module’s functionality.
Prepping ensures that all types contained in a data class as well as the usage of said types are supported by this module and pre-builds necessary constructs needed for encoding/decoding/etc.
Prepping will happen on-the-fly as needed, but a warning will be emitted in such cases, as it is better to explicitly prep all used types early in a process to ensure any invalid types or configuration are caught immediately.
Prepping a dataclass involves evaluating its type annotations, which, as of PEP 563, are stored simply as strings. This evaluation is done with localns set to the class dict (so that types defined in the class can be used) and globalns set to the containing module’s dict. It is possible to override globalns for special cases such as when prepping happens as part of an exec’ed string instead of within a module.
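Why the globalns override matters for exec'ed strings can be shown with a stdlib sketch (Spot and Coord are hypothetical; get_type_hints stands in for the annotation evaluation prepping performs). A class defined via exec has no containing module whose globals could resolve its string annotations, so the names must be supplied explicitly:

```python
import typing

ns: dict = {}
exec(
    'from dataclasses import dataclass\n'
    '@dataclass\n'
    'class Spot:\n'
    '    where: "Coord" = None\n',
    ns,
)

class Coord:
    """Forward-referenced type living outside the exec'd source."""

# With no real module to search, the string annotation only resolves
# when we pass an explicit globalns (the same escape hatch ioprep()
# provides):
hints = typing.get_type_hints(ns['Spot'], globalns={'Coord': Coord})
print(hints['where'])
```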
- efro.dataclassio.ioprepped(cls: type[T]) type[T] ¶
Class decorator for easily prepping a dataclass at definition time.
Note that in some cases it may not be possible to prep a dataclass immediately (such as when its type annotations refer to forward-declared types). In these cases, ioprep() should be explicitly called for the class as soon as possible; ideally at module import time to expose any errors as early as possible in execution.
- efro.dataclassio.is_ioprepped_dataclass(obj: Any) bool ¶
Return whether the obj is an ioprepped dataclass type or instance.
- efro.dataclassio.will_ioprep(cls: type[T]) type[T] ¶
Class decorator hinting that we will prep a class later.
In some cases (such as recursive types) we cannot use the @ioprepped decorator and must instead call ioprep() explicitly later. However, some of our custom pylint checking behaves differently when the @ioprepped decorator is present: it then requires type annotations to be actually present and not simply forward-declared under an ‘if TYPE_CHECKING’ block, since they are used at runtime.
The @will_ioprep decorator triggers the same pylint behavior differences as @ioprepped (which are necessary for the later ioprep() call to work correctly) but without actually running any prep itself.