
efro.dataclassio package

Submodules

efro.dataclassio.extras module

Extra rarely-needed functionality related to dataclasses.

class efro.dataclassio.extras.DataclassDiff(obj1: Any, obj2: Any)[source]

Bases: object

Wraps dataclass_diff() in an object for efficiency.

It is preferable to pass this to logging calls instead of the final diff string since the diff will never be generated if the associated logging level is not being emitted.
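
For illustration, a minimal sketch of this pattern (the Config dataclass here is hypothetical):

import logging
from dataclasses import dataclass

from efro.dataclassio import ioprepped
from efro.dataclassio.extras import DataclassDiff

@ioprepped
@dataclass
class Config:  # Hypothetical example type.
    name: str = 'default'
    count: int = 0

old = Config()
new = Config(count=5)

# Passing the DataclassDiff wrapper (rather than a pre-built string) means
# the diff is only computed if DEBUG-level logging is actually emitted.
logging.debug('Config changed: %s', DataclassDiff(old, new))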

efro.dataclassio.extras.dataclass_diff(obj1: Any, obj2: Any) → str[source]

Generate a string showing differences between two dataclass instances.

Both must be of the exact same type.

Module contents

Functionality for importing, exporting, and validating dataclasses.

This allows complex nested dataclasses to be flattened to json-compatible data and restored from said data. It also gracefully handles and preserves unrecognized attribute data, allowing older clients to interact with newer data formats in a nondestructive manner.
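
For orientation, a minimal round-trip sketch using the functions documented below (the Settings type is hypothetical):

from dataclasses import dataclass, field

from efro.dataclassio import dataclass_from_dict, dataclass_to_dict, ioprepped

@ioprepped
@dataclass
class Settings:  # Hypothetical example type.
    volume: float = 1.0
    tags: list[str] = field(default_factory=list)

# Flatten to json-compatible data and restore it losslessly.
out = dataclass_to_dict(Settings(volume=0.5, tags=['a', 'b']))
restored = dataclass_from_dict(Settings, out)
assert restored.volume == 0.5 and restored.tags == ['a', 'b']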

class efro.dataclassio.Codec(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: Enum

Specifies the data format being exported to or imported from.

FIRESTORE = 'firestore'
JSON = 'json'

class efro.dataclassio.DataclassFieldLookup(cls: type[T])

Bases: Generic[T]

Get info about nested dataclass fields in a type-safe way.

path(callback: Callable[[T], Any]) → str[source]

Look up a path on child dataclass fields.

Example

DataclassFieldLookup(MyType).path(lambda obj: obj.foo.bar)

The above example will return the string ‘foo.bar’ or something like ‘f.b’ if the dataclasses have custom storage names set. It will also be static-type-checked, triggering an error if MyType.foo.bar is not a valid path. Note, however, that the callback technically allows any return value, but only nested dataclasses and their fields will succeed.

paths(callback: Callable[[T], list[Any]]) → list[str][source]

Look up multiple paths on child dataclass fields.

Functionality is identical to path() but for multiple paths at once.

Example

DataclassFieldLookup(MyType).paths(lambda obj: [obj.foo, obj.bar])
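
A small sketch tying both calls together; the types and storage names here are hypothetical:

from dataclasses import dataclass, field
from typing import Annotated

from efro.dataclassio import DataclassFieldLookup, IOAttrs, ioprepped

@ioprepped
@dataclass
class Inner:  # Hypothetical nested type.
    bar: Annotated[int, IOAttrs('b')] = 0

@ioprepped
@dataclass
class MyType:  # Hypothetical top-level type.
    foo: Annotated[Inner, IOAttrs('f')] = field(default_factory=Inner)
    flag: bool = False

lookup = DataclassFieldLookup(MyType)
print(lookup.path(lambda obj: obj.foo.bar))           # 'f.b' (storage names used)
print(lookup.paths(lambda obj: [obj.foo, obj.flag]))  # ['f', 'flag']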

class efro.dataclassio.IOAttrs(storagename: str | None = None, *, store_default: bool = True, whole_days: bool = False, whole_hours: bool = False, whole_minutes: bool = False, soft_default: Any = <efro.dataclassio._base.IOAttrs._MissingType object>, soft_default_factory: Callable[[], Any] | _MissingType = <efro.dataclassio._base.IOAttrs._MissingType object>)

Bases: object

For specifying io behavior in annotations.

‘storagename’, if passed, is the name used when storing to json/etc.

‘store_default’ can be set to False to avoid writing values when equal to the default value. Note that this requires the dataclass field to define a default or default_factory or for its IOAttrs to define a soft_default value.

‘whole_days’, if True, requires datetime values to be exactly on day boundaries (see efro.util.utc_today()).

‘whole_hours’, if True, requires datetime values to lie exactly on hour boundaries (see efro.util.utc_this_hour()).

‘whole_minutes’, if True, requires datetime values to lie exactly on minute boundaries (see efro.util.utc_this_minute()).

‘soft_default’, if passed, injects a default value into dataclass instantiation when the field is not present in the input data. This allows dataclasses to add new non-optional fields while gracefully ‘upgrading’ old data. Note that when a soft_default is present it will take precedence over field defaults when determining whether to store a value for a field with store_default=False (since the soft_default value is what we’ll get when reading that same data back in when the field is omitted).

‘soft_default_factory’ is similar to ‘default_factory’ in dataclass fields; it should be used instead of ‘soft_default’ for mutable types such as lists to prevent a single default object from unintentionally changing over time.
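
A sketch of how these options are typically attached to fields via typing.Annotated; the Profile type and its storage names are hypothetical:

from dataclasses import dataclass, field
from datetime import datetime
from typing import Annotated

from efro.dataclassio import IOAttrs, ioprepped
from efro.util import utc_this_hour

@ioprepped
@dataclass
class Profile:  # Hypothetical example type.
    # Newer non-optional field; old input data lacking it gets [] injected.
    tags: Annotated[list[str], IOAttrs('t', soft_default_factory=list)]

    # Stored under the short key 'n' instead of 'name'.
    name: Annotated[str, IOAttrs('n')] = ''

    # Omitted from output while it still equals the default value.
    score: Annotated[int, IOAttrs('s', store_default=False)] = 0

    # Required to lie exactly on an hour boundary.
    updated: Annotated[datetime, IOAttrs('u', whole_hours=True)] = field(
        default_factory=utc_this_hour
    )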

MISSING = <efro.dataclassio._base.IOAttrs._MissingType object>
soft_default: Any = <efro.dataclassio._base.IOAttrs._MissingType object>
soft_default_factory: Callable[[], Any] | _MissingType = <efro.dataclassio._base.IOAttrs._MissingType object>
storagename: str | None = None
store_default: bool = True
validate_datetime(value: datetime, fieldpath: str) → None[source]

Ensure a datetime value meets our value requirements.

validate_for_field(cls: type, field: Field) → None[source]

Ensure the IOAttrs instance is ok to use with the provided field.

whole_days: bool = False
whole_hours: bool = False
whole_minutes: bool = False

class efro.dataclassio.IOExtendedData

Bases: object

A class that data types can inherit from for extra functionality.

did_input() → None[source]

Called on a class instance after it has been created from data.

Can be useful to correct values from the db, etc., in their type-safe form.

classmethod handle_input_error(exc: Exception) → Self | None[source]

Called when an error occurs during input decoding.

This allows a type to optionally return substitute data to be used in place of the failed decode. If it returns None, the original exception is re-raised.

It is generally a bad idea to apply catch-alls such as this, as it can lead to silent data loss. This should only be used in specific cases such as user settings where an occasional reset is harmless and is preferable to keeping all contained enums and other values backward compatible indefinitely.

classmethod will_input(data: dict) → None[source]

Called on raw data before a class instance is created from it.

Can be overridden to migrate old data formats to new, etc.

will_output() → None[source]

Called before data is sent to an outputter.

Can be overridden to validate or filter data before sending it on its way.
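
A sketch of a typical migration/validation pattern using these hooks (the AppSettings type and its old 'vol' key are hypothetical):

from __future__ import annotations

from dataclasses import dataclass

from efro.dataclassio import IOExtendedData, ioprepped

@ioprepped
@dataclass
class AppSettings(IOExtendedData):  # Hypothetical example type.
    volume: float = 1.0

    @classmethod
    def will_input(cls, data: dict) -> None:
        # Migrate a hypothetical old key name in place before decoding.
        if 'vol' in data and 'volume' not in data:
            data['volume'] = data.pop('vol')

    def did_input(self) -> None:
        # Clamp values coming from possibly-stale stored data.
        self.volume = min(max(self.volume, 0.0), 1.0)

    @classmethod
    def handle_input_error(cls, exc: Exception) -> AppSettings | None:
        # Fall back to defaults on decode errors; acceptable here only
        # because these are simple user settings (see the warning above).
        return cls()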

class efro.dataclassio.IOMultiType

Bases: Generic[EnumT]

A base class for types that can map to multiple dataclass types.

This enables usage of high level base classes (for example a ‘Message’ type) in annotations, with dataclassio automatically serializing & deserializing dataclass subclasses based on their type (‘MessagePing’, ‘MessageChat’, etc.)

Standard usage involves creating a class inheriting from this one to act as a ‘registry’, and then creating dataclass classes that inherit from that registry class. Dataclassio will then do the right thing when that registry class is used in type annotations.

See tests/test_efro/test_dataclassio.py for examples.

classmethod get_type(type_id: EnumT) → type[Self][source]

Return a specific subclass given a type-id.

classmethod get_type_id() → EnumT[source]

Return the type-id for this subclass.

classmethod get_type_id_storage_name() → str[source]

Return the key used to store type id in serialized data.

The default is an obscure value so that it does not conflict with members of individual type attrs, but in some cases one might prefer to serialize it to something simpler like ‘type’ by overriding this call. One just needs to make sure that no encompassed types serialize anything to ‘type’ themselves.

classmethod get_type_id_type() → type[EnumT][source]

Return the Enum type this class uses as its type-id.
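
A minimal registry sketch consistent with the classmethods above (all names here are hypothetical; see the tests mentioned above for authoritative examples):

from dataclasses import dataclass, field
from enum import Enum

from efro.dataclassio import IOMultiType, ioprepped

class ToolID(Enum):  # Hypothetical type-id enum.
    HAMMER = 'hammer'
    WRENCH = 'wrench'

class Tool(IOMultiType[ToolID]):
    # Hypothetical 'registry' base class usable in annotations.

    @classmethod
    def get_type(cls, type_id: ToolID) -> type['Tool']:
        # Map each type-id to its concrete dataclass.
        if type_id is ToolID.HAMMER:
            return Hammer
        if type_id is ToolID.WRENCH:
            return Wrench
        raise ValueError(f'Unhandled type-id: {type_id}')

    @classmethod
    def get_type_id_type(cls) -> type[ToolID]:
        return ToolID

@ioprepped
@dataclass
class Hammer(Tool):
    weight: float = 1.0

    @classmethod
    def get_type_id(cls) -> ToolID:
        return ToolID.HAMMER

@ioprepped
@dataclass
class Wrench(Tool):
    size: int = 10

    @classmethod
    def get_type_id(cls) -> ToolID:
        return ToolID.WRENCH

@ioprepped
@dataclass
class Toolbox:  # Hypothetical container; the registry class appears in annotations.
    tools: list[Tool] = field(default_factory=list)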

class efro.dataclassio.JsonStyle(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)

Bases: Enum

Different style types for json.

FAST = 'fast'
PRETTY = 'pretty'
SORTED = 'sorted'

efro.dataclassio.dataclass_from_dict(cls: type[T], values: dict, *, codec: Codec = Codec.JSON, coerce_to_float: bool = True, allow_unknown_attrs: bool = True, discard_unknown_attrs: bool = False) → T

Given a dict, return a dataclass of a given type.

The dict must be formatted to match the specified codec (generally json-friendly object types). This means that sequence values such as tuples or sets should be passed as lists, enums should be passed as their associated values, nested dataclasses should be passed as dicts, etc.

All values are checked to ensure their types/values are valid.

Data for attributes of type Any will be checked to ensure they match types supported directly by json. This does not include types such as tuples which are implicitly translated by Python’s json module (as this would break the ability to do a lossless round-trip with data).

If coerce_to_float is True, int values passed for float typed fields will be converted to float values. Otherwise, a TypeError is raised.

If allow_unknown_attrs is False, AttributeErrors will be raised for attributes present in the dict but not on the data class. Otherwise, they will be preserved as part of the instance and included if it is exported back to a dict, unless discard_unknown_attrs is True, in which case they will simply be discarded.
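
A short sketch of these behaviors (the Point type is hypothetical):

from dataclasses import dataclass

from efro.dataclassio import dataclass_from_dict, dataclass_to_dict, ioprepped

@ioprepped
@dataclass
class Point:  # Hypothetical example type.
    x: float = 0.0
    y: float = 0.0

# int values are coerced to float for float-typed fields by default.
pt = dataclass_from_dict(Point, {'x': 1, 'y': 2.5})

# Unknown attrs are preserved by default and included again on export...
pt2 = dataclass_from_dict(Point, {'x': 1.0, 'y': 2.0, 'extra': 'hi'})
assert dataclass_to_dict(pt2).get('extra') == 'hi'

# ...or can be discarded explicitly.
pt3 = dataclass_from_dict(
    Point, {'x': 1.0, 'y': 2.0, 'extra': 'hi'}, discard_unknown_attrs=True
)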

efro.dataclassio.dataclass_from_json(cls: type[T], json_str: str, coerce_to_float: bool = True, allow_unknown_attrs: bool = True, discard_unknown_attrs: bool = False) → T

Return a dataclass instance given a json string.

Basically dataclass_from_dict(json.loads(…))

efro.dataclassio.dataclass_hash(obj: Any, coerce_to_float: bool = True) → str

Calculate a hash for the provided dataclass.

Basically this emits json for the dataclass (with keys sorted to keep things deterministic) and hashes the resulting string.
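
A quick sketch of using this for change detection (the Prefs type is hypothetical):

from dataclasses import dataclass

from efro.dataclassio import dataclass_hash, ioprepped

@ioprepped
@dataclass
class Prefs:  # Hypothetical example type.
    theme: str = 'dark'

# Equal values produce equal hashes, so this can be used to cheaply
# detect whether stored data actually needs rewriting.
assert dataclass_hash(Prefs()) == dataclass_hash(Prefs(theme='dark'))
assert dataclass_hash(Prefs()) != dataclass_hash(Prefs(theme='light'))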

efro.dataclassio.dataclass_to_dict(obj: Any, codec: Codec = Codec.JSON, coerce_to_float: bool = True, discard_extra_attrs: bool = False) → dict

Given a dataclass object, return a json-friendly dict.

All values will be checked to ensure they match the types specified on fields. Note that a limited set of types and data configurations is supported.

Values with type Any will be checked to ensure they match types supported directly by json. This does not include types such as tuples which are implicitly translated by Python’s json module (as this would break the ability to do a lossless round-trip with data).

If coerce_to_float is True, integer values present on float typed fields will be converted to float in the dict output. If False, a TypeError will be triggered.

efro.dataclassio.dataclass_to_json(obj: Any, coerce_to_float: bool = True, pretty: bool = False, sort_keys: bool | None = None) → str

Utility function; return a json string from a dataclass instance.

Basically json.dumps(dataclass_to_dict(…)). By default, keys are sorted for pretty output and not otherwise, but this can be overridden by supplying a value for the ‘sort_keys’ arg.
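
A quick sketch of the output options (the Msg type is hypothetical):

from dataclasses import dataclass

from efro.dataclassio import dataclass_to_json, ioprepped

@ioprepped
@dataclass
class Msg:  # Hypothetical example type.
    body: str = 'hello'
    count: int = 1

compact = dataclass_to_json(Msg())                 # single line; keys unsorted
pretty = dataclass_to_json(Msg(), pretty=True)     # indented; keys sorted by default
stable = dataclass_to_json(Msg(), sort_keys=True)  # compact but deterministic key order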

efro.dataclassio.dataclass_validate(obj: Any, coerce_to_float: bool = True, codec: Codec = Codec.JSON, discard_extra_attrs: bool = False) → None

Ensure that values in a dataclass instance are the correct types.

efro.dataclassio.ioprep(cls: type, globalns: dict | None = None) → None

Prep a dataclass type for use with this module’s functionality.

Prepping ensures that all types contained in a data class as well as the usage of said types are supported by this module and pre-builds necessary constructs needed for encoding/decoding/etc.

Prepping will happen on-the-fly as needed, but a warning will be emitted in such cases, as it is better to explicitly prep all used types early in a process to ensure any invalid types or configuration are caught immediately.

Prepping a dataclass involves evaluating its type annotations, which, as of PEP 563, are stored simply as strings. This evaluation is done with localns set to the class dict (so that types defined in the class can be used) and globalns set to the containing module’s dict. It is possible to override globalns for special cases such as when prepping happens as part of an exec’ed string instead of within a module.

efro.dataclassio.ioprepped(cls: type[T]) → type[T]

Class decorator for easily prepping a dataclass at definition time.

Note that in some cases it may not be possible to prep a dataclass immediately (such as when its type annotations refer to forward-declared types). In these cases, ioprep() should be explicitly called for the class as soon as possible; ideally at module import time to expose any errors as early as possible in execution.

efro.dataclassio.is_ioprepped_dataclass(obj: Any) → bool

Return whether the obj is an ioprepped dataclass type or instance.

efro.dataclassio.will_ioprep(cls: type[T]) → type[T]

Class decorator hinting that we will prep a class later.

In some cases (such as recursive types) we cannot use the @ioprepped decorator and must instead call ioprep() explicitly later. However, some of our custom pylint checking behaves differently when the @ioprepped decorator is present, in that case requiring type annotations to be present and not simply forward-declared under an “if TYPE_CHECKING” block (since they are used at runtime).

The @will_ioprep decorator triggers the same pylint behavior differences as @ioprepped (which are necessary for the later ioprep() call to work correctly) but without actually running any prep itself.
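
A minimal sketch of this deferred-prep pattern for a recursive type (the TreeNode type is hypothetical):

from __future__ import annotations

from dataclasses import dataclass, field

from efro.dataclassio import ioprep, will_ioprep

@will_ioprep
@dataclass
class TreeNode:  # Hypothetical recursive type; @ioprepped cannot be used here.
    value: int = 0
    children: list[TreeNode] = field(default_factory=list)

# Prep explicitly once the class exists; ideally right at module import
# time so any invalid types or configuration are caught early.
ioprep(TreeNode)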