
Add polymorph object instance support to BaseSerializationContext
Closed, ResolvedPublic

Description

While the BaseSerializationSaveContext supports all data types as far as possible, there is one addition that would make it truly powerful: the ability to write AND read polymorphic objects, i.e. different runtime instances that share a common root. Right now writing is implicitly supported, because all detected properties of the concrete instance are written. On read, however, only the base type is created from the properties, because the information about which type was written is lost. Here is an example:

class TAG_BaseType
{
	int m_iBaseValue;
};

class TAG_InheritedType : TAG_BaseType
{
	float m_iAddtionalValue;
};

void Test()
{
    array<ref TAG_BaseType> baseArray();

    TAG_BaseType baseEntry();
    baseEntry.m_iBaseValue = 42;
    baseArray.Insert(baseEntry);

    TAG_InheritedType inheritedEntry();
    inheritedEntry.m_iBaseValue = 1337;
    inheritedEntry.m_iAddtionalValue = 12.34;
    baseArray.Insert(inheritedEntry);

    SCR_JsonSaveContext writer();
    writer.WriteValue("baseArray", baseArray);
    string dataString = writer.ExportToString();
    Print(dataString);

    SCR_JsonLoadContext loader();
    loader.ImportFromString(dataString);
    array<ref TAG_BaseType> deserializedEntries();
    loader.ReadValue("baseArray", deserializedEntries);
    deserializedEntries.Debug();
}

SCRIPT : string dataString = '{"baseArray":[{"m_iBaseValue":42},{"m_iBaseValue":1337,"m_iAddtionalValue":12.34}]}'
SCRIPT : Array count: 2
SCRIPT : [0] => TAG_BaseType<0x0000025C9FE64150>
SCRIPT : [1] => TAG_BaseType<0x0000025C9FE652F0>

Other serialization implementations, such as those in C#, solve this problem with a type discriminator: an extra "magic" field, usually placed first in any polymorphic object written, that lets the deserializer check for it and adjust the type it reads on the fly instead of relying solely on the compile-time type it was passed.

An implementation could look like this:

1. Add the capability to opt into this, as some people may not need it. This could be done by adding:

class BaseSerializationContext: Managed
{
	proto external void EnableTypeDiscriminators(string fieldname = "$type");
}

Maybe the method could take a second bool param, disabled by default, to always write the discriminator even when the runtime type matches the base type, for consistency? It could be useful for some to have a discriminator field on every object of a potentially mixed array.

2. On writing do the following checks:

- Are type discriminators enabled?
    - Yes?
        - Is the runtime type currently being written different from the compile-time type passed in? (in the example above TAG_InheritedType vs TAG_BaseType)
            - Yes?
                - WriteValue("$type", inst.Type().ToString())
                - Continue with the rest of the write implementation, either through auto-properties or the implemented SerializationSave method.

This only annotates the data when needed, and in a place where deserializers in other languages would expect it. The name "$type" is the commonly used one, but it can be adjusted if needed.
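The write-side check can be modelled outside the engine. The following is only a sketch in Python, not Enfusion API: write_object, its enabled and always_write parameters, and the use of plain dictionaries are assumptions made for illustration. The class and field names are copied from the example above.

```python
import json


class TAG_BaseType:
    def __init__(self, base_value=0):
        self.m_iBaseValue = base_value


class TAG_InheritedType(TAG_BaseType):
    def __init__(self, base_value=0, additional_value=0.0):
        super().__init__(base_value)
        self.m_iAddtionalValue = additional_value


def write_object(obj, declared_type, field_name="$type",
                 enabled=True, always_write=False):
    """Hypothetical helper: turn obj into a dict of its properties.

    The discriminator is only added when the runtime type differs from the
    declared (compile-time) type, or when always_write is requested.
    """
    data = {}
    if enabled and (always_write or type(obj) is not declared_type):
        # Inserted first so a reader sees it before any other property.
        data[field_name] = type(obj).__name__
    data.update(vars(obj))  # stands in for the auto-property write
    return data


entries = [TAG_BaseType(42), TAG_InheritedType(1337, 12.34)]
payload = json.dumps(
    {"baseArray": [write_object(e, TAG_BaseType) for e in entries]},
    separators=(",", ":"),
)
print(payload)
```

With always_write left at its default, the base entry stays unannotated and only the inherited entry carries the "$type" key.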

2.1 JSON container write

For JSON container writing the process is straightforward. The additional key is written inside the object and nothing more needs to be done.

2.2 BIN container write

For binary streams, a bit of annotation work is required so that the data is not misinterpreted on read. We can't check whether the "$type" key exists. Instead, before any of the actual object data is written, I would propose writing a special single byte like FD for "forward discriminator", followed by the serialized discriminator string data (so probably a 4-byte string size and then the string bytes), and then the rest of the object data.
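To make the proposed binary layout concrete, here is a small Python sketch of the framing. The byte order (little-endian), the UTF-8 string encoding and the helper names are assumptions; the ticket only proposes a marker byte, a 4-byte size and the string bytes.

```python
import struct

DISCRIMINATOR_MARKER = 0xFD  # "forward discriminator" byte proposed above


def encode_discriminator(type_name: str) -> bytes:
    """Marker byte + 4-byte length + type name bytes (assumed UTF-8)."""
    name_bytes = type_name.encode("utf-8")
    # "<BI" = one unsigned byte, then an unsigned 32-bit int, no padding.
    return struct.pack("<BI", DISCRIMINATOR_MARKER, len(name_bytes)) + name_bytes


def write_object_binary(type_name: str, declared_type_name: str, body: bytes) -> bytes:
    """Hypothetical helper: prefix the object body only for inherited types."""
    if type_name == declared_type_name:
        return body  # base type: stream stays exactly as it is today
    return encode_discriminator(type_name) + body


# Object body for the inherited entry: an int followed by a float.
body = struct.pack("<if", 1337, 12.34)
framed = write_object_binary("TAG_InheritedType", "TAG_BaseType", body)
print(framed.hex(" "))
```

The base type produces the same bytes as before, so existing data that only contains base instances would not change.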

3. On read, similar checks can be done

When reading an object, either directly as a member of an object or as part of an array/set or as a map value, again check:

- Are type discriminators enabled?
    - Yes?
        - Do we see the "$type" key or special binary sequence peeked from the stream?
            - Yes?
                - Consume the key / read the binary sequence to find the runtime type
                    - Check that the type named by the string exists AND, to avoid any abuse, that the read type inherits from the base compile-time type passed into the serializer read operation
                        - Type ok?
                            - Does the passed-in compile time type match the read discriminator type?
                                - No? 
                                    - Ignore the passed-in instance and override it (if it was not already null) with the concrete inherited type and continue reading with that one.
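
The read-side checks can be sketched the same way. This is a Python model, not Enfusion API: KNOWN_TYPES stands in for whatever type lookup the engine has, read_object is a hypothetical helper, and raising an error for a rejected type is one possible handling.

```python
import json


class TAG_BaseType:
    m_iBaseValue = 0


class TAG_InheritedType(TAG_BaseType):
    m_iAddtionalValue = 0.0


class TAG_Unrelated:
    pass


# Hypothetical stand-in for the engine's "does this type exist" lookup.
KNOWN_TYPES = {cls.__name__: cls
               for cls in (TAG_BaseType, TAG_InheritedType, TAG_Unrelated)}


def read_object(data: dict, declared_type, field_name="$type"):
    """Create the concrete instance named by the discriminator, if any."""
    fields = dict(data)
    type_name = fields.pop(field_name, None)  # consume the key if present
    target = declared_type
    if type_name is not None:
        candidate = KNOWN_TYPES.get(type_name)
        # The type must exist and must inherit from the declared type,
        # otherwise the data could instantiate arbitrary classes.
        if candidate is None or not issubclass(candidate, declared_type):
            raise TypeError(f"{type_name!r} is not a known subtype of "
                            f"{declared_type.__name__}")
        target = candidate
    obj = target()
    for key, value in fields.items():
        setattr(obj, key, value)  # stands in for the auto-property read
    return obj


payload = ('{"baseArray":[{"m_iBaseValue":42},{"$type":"TAG_InheritedType",'
           '"m_iBaseValue":1337,"m_iAddtionalValue":12.34}]}')
entries = [read_object(item, TAG_BaseType)
           for item in json.loads(payload)["baseArray"]]
print([type(e).__name__ for e in entries])
```

An entry without the key is read as the declared type, exactly as it is today.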

An accidental conflict between the binary sequence and real class properties should not be possible. Even if a class's data started with a single byte followed by string look-alike data for whatever reason, the peeked data would be ignored and the read would continue as normal if interpreting it did not produce a usable type string. And if it is a polymorphic object, the type discriminator must have been written there last time, so it is always safe to consume it, as the original data comes afterwards. The sequence is never written on base write+read and always on inherited write+read, so that should work out fine.
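
The peek-and-fall-back behaviour for binary streams could look roughly like the Python sketch below. The little-endian length, the UTF-8 decoding, the KNOWN_TYPE_NAMES set and the function name are assumptions for illustration only.

```python
import io
import struct

DISCRIMINATOR_MARKER = 0xFD
# Hypothetical stand-in for checking that a type string names a real class.
KNOWN_TYPE_NAMES = {"TAG_BaseType", "TAG_InheritedType"}


def try_read_discriminator(stream: io.BytesIO):
    """Return the type name if the stream starts with a valid discriminator.

    Otherwise rewind to where we started and return None, so the caller
    reads the bytes as ordinary object data.
    """
    start = stream.tell()
    header = stream.read(5)  # marker byte + 4-byte length
    if len(header) == 5 and header[0] == DISCRIMINATOR_MARKER:
        (length,) = struct.unpack_from("<I", header, 1)
        raw = stream.read(length)
        if len(raw) == length:
            try:
                name = raw.decode("utf-8")
            except UnicodeDecodeError:
                name = None
            if name in KNOWN_TYPE_NAMES:
                return name  # consumed; object data follows
    stream.seek(start)  # look-alike or plain data: nothing is consumed
    return None


body = struct.pack("<if", 1337, 12.34)
annotated = io.BytesIO(b"\xfd" + struct.pack("<I", 17) + b"TAG_InheritedType" + body)
print(try_read_discriminator(annotated), annotated.read() == body)
```

Data that merely resembles the sequence but does not yield a known type name leaves the stream position untouched.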

Should the user attempt to read serialized data with a compile-time type that does not match the discriminator's inheritance tree, throw an error and abort the serializer read. I think the situation is not recoverable. But that is the same as if I wrote my type into a binary stream and then deleted the class from script: there is no way to read that anymore. Any other class type that attempts to read it would get data in the wrong layout and produce trash values.

The result would look like this if everything is working:

SCRIPT : string dataString = '{"baseArray":[{"m_iBaseValue":42},{"$type":"TAG_InheritedType","m_iBaseValue":1337,"m_iAddtionalValue":12.34}]}'
SCRIPT : Array count: 2
SCRIPT : [0] => TAG_BaseType<0x0000025C9FE64150>
SCRIPT : [1] => TAG_InheritedType<0x0000025C9FE652F0>

I believe that with "just these simple steps" we could have polymorphic class field members, arrays, sets and even maps (for the value part). The C++ code for polymorphic type annotations when using BaseSerializationSaveContext::WriteGameEntity could be replaced by this system too, offering the same convenience to programmers and modders alike.

Thank you!

Details

Severity
Major
Resolution
Open
Reproducibility
N/A
Operating System
Windows 10 x64
Category
General

Event Timeline

Arkensor created this task. Jun 5 2023, 9:18 AM
Arkensor updated the task description. Jun 5 2023, 9:23 AM
Arkensor updated the task description.
Arkensor updated the task description. Jun 5 2023, 9:26 AM
Arkensor updated the task description.
Arkensor updated the task description. Jun 5 2023, 11:18 AM
Geez changed the task status from New to Assigned. Jun 6 2023, 2:49 PM
Geez closed this task as Resolved. (Edited) Jul 20 2023, 5:26 PM
Geez claimed this task.
Geez added a subscriber: Geez.

Hello Arkensor.
This has been resolved internally and the feature will appear in one of the upcoming updates.