Author Topic: Reading Ruby's Marshal Format  (Read 9447 times)

Offline G_G

  • Green Gmod Game_Guy AKA G4 AKA hyper-G AKA G-force
  • Global Moderator
  • Chaos Ultimate
  • ****
  • Posts: 6586
  • LV: 407
  • Gender: Male
    • View Profile
Reading Ruby's Marshal Format
« on: December 08, 2013, 06:17:13 AM »
So I've been trying to study Ruby's Marshal binary format. And I've figured out a few things and I'm still trying to figure a couple of things out, maybe some of you guys can help. I've only serialized a few data types and identified a few key bytes.

Here's some code.
Code: [Select]
class ZObject
  attr_accessor :data
  def initialize(obj = nil)
    @data = obj
  end
end

obj = [ZObject.new(true), ZObject.new(false)]
# The block form closes the file automatically.
File.open("format.bin", "wb") do |file|
  Marshal.dump([5, 6, 7, 8], file)
  Marshal.dump(obj, file)
end

Here's the output in bytes and the respective characters

04 08 5B 09 69 0A 69 0B 69 0C 69 0D 04 08 5B 07 - ..[.i.i.i.i...[.
6F 3A 0C 5A 4F 62 6A 65 63 74 06 3A 0A 40 64 61 - o:.ZObject.:.@da
74 61 54 6F 3B 00 06 3B 06 46                   - taTo;..;.F
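
For anyone who wants to produce the same kind of listing from their own files, here is a rough Python sketch (not from the original post). The name `hexdump` and the 16-byte row width are just choices made for this example.

```python
def hexdump(data, width=16):
    """Format bytes as 'HEX - ASCII' rows, mirroring the listing above."""
    rows = []
    pad = width * 3 - 1                       # width of a full hex column
    for start in range(0, len(data), width):
        chunk = data[start:start + width]
        hex_part = " ".join("%02X" % b for b in chunk)
        # Printable ASCII is shown as-is; everything else becomes a dot.
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        # Short final rows are padded so the ASCII column stays aligned.
        rows.append(hex_part.ljust(pad) + " - " + text)
    return rows


sample = bytes.fromhex("04 08 5B 09 69 0A 69 0B 69 0C 69 0D 04 08 5B 07")
for row in hexdump(sample):
    print(row)
```

To dump a real file, pass in the result of reading it in binary mode, e.g. `hexdump(open("format.bin", "rb").read())`.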


I read somewhere that the first two bytes are Marshal's major and minor version, so in this case it would be 4.8.
Arrays are identified with the byte 0x5B, and the byte afterwards is the number of entries in the array. What's odd, however, is that integers are serialized as 5 higher than they should be: 0x09 is the size of the array, subtract 5 and you get 4. Integers are identified with the byte 0x69 (giggity) and the byte(s) following are the actual integer, again 5 higher than the real value. I'm not sure why that is yet, and I've only tried small values so far; I haven't done 255 or higher. There are no bytes that act as separators. One thing I found out: every time Marshal.dump is called, it dumps the version number again. As you'll see, after the first array was dumped, you'll find the bytes 04 and 08 again.
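
The observations so far can be checked with a small Python sketch (not from the original post). It assumes every integer fits in a single byte stored as "value + 5", which holds for the dump above but not for larger numbers. The function name `read_small_int_array` is made up for this example.

```python
def read_small_int_array(data):
    """Decode one dump that holds an array of single-byte integers."""
    major, minor = data[0], data[1]
    if data[2] != 0x5B:                  # '[' marks an array
        raise ValueError("expected an array marker")
    count = data[3] - 5                  # small values are stored as value + 5
    values = []
    pos = 4
    for _ in range(count):
        if data[pos] != 0x69:            # 'i' marks an integer
            raise ValueError("expected an integer marker")
        values.append(data[pos + 1] - 5)
        pos += 2
    return (major, minor), values, pos


first_dump = bytes.fromhex("04 08 5B 09 69 0A 69 0B 69 0C 69 0D")
print(read_small_int_array(first_dump))
```

The returned position is where the next dump in the file would start, which is why the second `04 08` pair shows up right after the last integer.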

True is identified as a capital T (0x54), while false is a capital F (0x46).

Now let's look at bytes 6F, 3A, 0C, at the beginning of the 2nd line. I'm not entirely sure, but I've narrowed 6F down to telling Ruby that a serialized class instance follows. The byte afterwards, 3A, I'm not too sure about. I think it specifies a symbol, class or variable name, because right after it is the size of the class name/variable name. 0C is the size of the class name: 0C is 12, minus 5 is 7, which is the number of characters in ZObject. The byte right after ZObject, 06, I'm not quite sure what this one means. When a variable is serialized, the data belonging to the variable is serialized immediately after its name. If it's an integer, it'll be written as an i and then the number.

Now onto something that I noticed, if an object has already been serialized in the same Marshal.dump call, it gets more compressed.

6F 3B 00 06 3B 06 46 - o;..;.F

These last bytes in the file are the 2nd ZObject, so it gets more compressed. What I'm trying to understand here is how this list of bytes is equivalent to the first serialized ZObject, other than @data being set to false rather than true. Is it generating a checksum?

So I added a new class with the same variable name and discovered something else. When serializing data, to reduce space, it seems to generate a byte declaring the name of a class, symbol, variable.
New Code (Only serializes two objects, one ZObject and one SObject. Each has a variable named @data)
(click to show/hide)

It generates this. (Pay attention to how @data is written for each object.)

04 08 5B 07 6F 3A 0C 5A 4F 62 6A 65 63 74 06 3A - ..[.o:.ZObject.:
0A 40 64 61 74 61 54 6F 3A 0C 53 4F 62 6A 65 63 - .@dataTo:.SObjec
74 06 3B 06 46                                  - t.;.F


As you can see, the bytes 3A 0A 40 64 61 74 61 (:.@data) are @data serialized in the ZObject; however, when it is serialized again in the SObject, it comes out as just 3B 06. You can tell it's getting more compressed. I'm still unsure what the byte 06 that comes right after the last character of a class name means.

I'm wondering if anyone can help me figure out or knows anything about how they decide how to compress it.
« Last Edit: December 08, 2013, 06:18:26 AM by gameus »

Offline orochii

  • Transcended Spirit
  • ***
  • Posts: 145
  • LV: 14
  • Rhapsody of the Warrior of Ice
    • View Profile
Re: Reading Ruby's Marshal Format
« Reply #1 on: December 08, 2013, 09:33:15 AM »
I'm going to take a guess about the red part. "3B" means repeating something. For example, when it says 3B 00 it is stating a repetition of a class (probably the previous one). When it says 3B 06 it is referring to a repetition of a variable (or maybe a boolean?).
It could help dumping something like this:
Code: [Select]
[ZObject.new, SObject.new, ZObject.new, SObject.new]
Just to see what happens with the first ZObject name/class/type reference: whether it is actually reused when writing the second ZObject, or recreated.
If it is reused, then we have the second part of our plan: see what reference the SObject class gets when a second one is added after a ZObject.

So, I did that test, defining four objects from 2 different classes in that way. I used your code just to speed up things. Here is the result:
04 08 5B 09 6F 3A 0C 5A 4F 62 6A 65 63 74 06 3A - ..[.o:.ZObject.:
0A 40 64 61 74 61 54 6F 3A 0C 53 4F 62 6A 65 63 - .@dataTo:.SObjec
74 06 3B 06 46 6F 3B 00 06 3B 06 54 6F 3B 07 06 - t.;.Fo;..;.To;..
3B 06 46                                        - ;.F


My assumption is that when reading and writing, it creates a reference list. So for example, it uses 6F for serialized class, 3A for "new reference", and then the reference bytesize. 3B means existing reference, and then throws the reference ID (00 and 07, I wonder what makes it select those numbers).
Same for variables, where it uses 06 for variable (or something?), then 3A or 3B for references.
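
The "reference list" idea can be tried out against the four-object dump with a rough Python sketch (not from the original post). It assumes that `:` adds a name to a list in order of first appearance and that `;` is followed by an index into that list. It only handles the type bytes seen in this thread and single-byte integers. `MiniReader` and `load` are names made up for this example.

```python
class MiniReader:
    """Reads the handful of type bytes that appear in the dumps above."""

    def __init__(self, data):
        self.data = data
        self.pos = 0
        self.symbols = []                # filled in order of first appearance

    def byte(self):
        value = self.data[self.pos]
        self.pos += 1
        return value

    def small_int(self):
        # Only the single-byte form is handled: 0, or value + 5.
        raw = self.byte()
        return raw - 5 if raw else 0

    def symbol(self):
        marker = chr(self.byte())
        if marker == ":":                # new name: length, then characters
            length = self.small_int()
            name = self.data[self.pos:self.pos + length].decode("ascii")
            self.pos += length
            self.symbols.append(name)
            return name
        if marker == ";":                # back-reference into the list
            return self.symbols[self.small_int()]
        raise ValueError("expected a symbol, got %r" % marker)

    def value(self):
        marker = chr(self.byte())
        if marker == "T":
            return True
        if marker == "F":
            return False
        if marker == "i":
            return self.small_int()
        if marker == "[":
            return [self.value() for _ in range(self.small_int())]
        if marker == "o":
            class_name = self.symbol()
            ivars = {}
            for _ in range(self.small_int()):
                key = self.symbol()
                ivars[key] = self.value()
            return (class_name, ivars)
        raise ValueError("unsupported type byte %r" % marker)


def load(data):
    reader = MiniReader(data[2:])        # skip the 04 08 version bytes
    return reader.value(), reader.symbols


dump = bytes.fromhex(
    "04 08 5B 09 6F 3A 0C 5A 4F 62 6A 65 63 74 06 3A"
    "0A 40 64 61 74 61 54 6F 3A 0C 53 4F 62 6A 65 63"
    "74 06 3B 06 46 6F 3B 00 06 3B 06 54 6F 3B 07 06"
    "3B 06 46"
)
objects, symbols = load(dump)
print(symbols)
print(objects)
```

Under that reading, 00 and 07 are simply indices 0 and 2 (stored with the usual +5 offset), pointing at ZObject and SObject, and 06 is index 1, pointing at @data.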

I wonder what happens when the array has more than 255 different-class objects.
Orochii Zouveleki
« Last Edit: December 08, 2013, 09:34:34 AM by orochii »

Offline Blizzard

  • This sexy
  • Administrator
  • has over 9000 posts
  • *****
  • Posts: 19982
  • LV: 648
  • Gender: Male
  • Magic midgets.
    • View Profile
    • You're already on it. (-_-')
Re: Reading Ruby's Marshal Format
« Reply #2 on: December 08, 2013, 11:35:56 AM »
I can actually help here, because I've worked a lot with this format. After all, Ryex and I made a Python implementation.

1. Yes, 4.8 is the format version and it's dumped every time you call Marshal.dump.

2. Integers are compressed. 0x00 is simply 0, 0x06 is 1, 0x7F is 122, 0xFA is -1 and 0x80 is -123. This offset by 5 is there because when the first byte is 1, 2, 3 or 4, it means that the following 1 to 4 bytes hold the integer (0xFF down to 0xFC do the same for negative integers). This is done to save space. e.g. if it's 3, then the following 3 bytes are a little-endian encoded integer (e.g. 03 FF FF 01 would be 131071).

3. Yes, 6F (the character 'o') means an object follows.

4. The ':' character after 'o' means Symbol, since the next thing is the class name.

5. If there is a ';', it means "This is a reference to a symbol that was already serialized previously", followed by the ID of that symbol. e.g. if three symbols were serialized, the value after ';' would be 0x00, 0x06 or 0x07 (0, 1 or 2), as in the dumps above. So it's not that the second object really gets compressed, it's literally a reference to a previously serialized name.
Just keep in mind that Strings may be put there as reference, but they should always be new objects. e.g. If you write a reader, make sure to save the first occurrence of a String, but always use Object#clone when you need to access it.

6. You already did the test with ZObject and SObject both having a variable called @data. @data is a symbol in the file and it gets treated like the class names, so it uses the ';' system for references.

7. The value after ZObject is yet another integer, indicating the number of instance variables that follow.
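
The integer rules from point 2 can be written out as a short Python sketch (not from the original post). It follows the same logic as Ruby's reader as far as I understand it; the name `read_long` and the `(value, next_pos)` return shape are choices made for this example.

```python
def read_long(data, pos=0):
    """Return (value, next_pos) for a Marshal integer starting at data[pos]."""
    c = data[pos]
    pos += 1
    if c > 127:
        c -= 256                         # reinterpret the byte as signed
    if c == 0:
        return 0, pos
    if 4 < c < 128:
        return c - 5, pos                # small positive value
    if -129 < c < -4:
        return c + 5, pos                # small negative value
    size = abs(c)                        # 1 to 4 payload bytes follow
    raw = int.from_bytes(data[pos:pos + size], "little")
    if c < 0:
        raw -= 1 << (8 * size)           # the missing high bytes are all 0xFF
    return raw, pos + size


print(read_long(bytes.fromhex("03 FF FF 01")))
```

Since the single-byte form tops out at 122, values such as 123 or 255 need the length-prefixed form (for example `01 7B` for 123).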



I suggest that you check out this topic: http://forum.chaos-project.com/index.php/topic,11920.0.html
The ARC Data format was inspired by Ruby's Marshal, except that it was simplified. The files end up a bit bigger usually, because there is no hardcore integer compression, but the format is easier and faster to read. It might give you some more insights into serialization in general. I even use a similar format for my Lite Serializer library.
« Last Edit: December 08, 2013, 12:05:40 PM by Blizzard »
Check out Daygames and our games:

King of Booze      King of Booze: Never Ever      Pet Bots
Drinking Game for Android      Never have I ever for Android      Pet Bots for Android
Drinking Game for iOS      Never have I ever for iOS      Pet Bots for iOS
Drinking Game on Steam


Quote from: winkio
I do not speak to bricks, either as individuals or in wall form.

Quote from: Barney Stinson
When I get sad, I stop being sad and be awesome instead. True story.

Offline G_G

Re: Reading Ruby's Marshal Format
« Reply #3 on: December 08, 2013, 05:15:52 PM »
Oh man, that really clears things up Blizzard. Just one more question, how is the id generated for classes/strings to be referenced for later?

Offline Blizzard

Re: Reading Ruby's Marshal Format
« Reply #4 on: December 08, 2013, 07:24:55 PM »
They start at 1 (or 0, I'm not 100% sure anymore) and are incremented for each object and string. So if a reference ID is 2, then it points to the 2nd object that was read from the file.

Offline G_G

Re: Reading Ruby's Marshal Format
« Reply #5 on: December 08, 2013, 07:27:17 PM »
Ah okay. And I'm assuming they're only given reference IDs if they're found in the file again? In my last test @data is given ID 1 rather than 2. Even though it read ZObject first. But if ZObject had been dumped again, it would have been ID 1?

Offline Blizzard

Re: Reading Ruby's Marshal Format
« Reply #6 on: December 08, 2013, 08:50:34 PM »
Yes. The ID is never written down anywhere, it's implicitly defined by the order of the objects. But it probably is saved somewhere internally during reading/writing. e.g. in ARC Data we simply use an array and use "index + 1" as ID.

Offline G_G

Re: Reading Ruby's Marshal Format
« Reply #7 on: December 08, 2013, 09:20:56 PM »
Yup, they start at 0. I did another test. So for every string object that gets read, just store an ID and increment it by 1. That way when I run into it, I can just reference it. Thanks for the help Blizzard. I'm gonna be looking into hashes now.

Offline Blizzard

Re: Reading Ruby's Marshal Format
« Reply #8 on: December 08, 2013, 09:57:33 PM »
The same goes actually for strings, arrays, hashes and objects. Just keep in mind that they all increment the counter. If you put a string, then an object, the object will have ID 1, not 0.
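
Here is a rough Python sketch of that counter (not from the original post). As far as I know, these object IDs are kept in a table of their own, separate from the symbol list, and are referenced with the `@` type byte (TYPE_LINK in Ruby's source). The sample bytes are hand-assembled in the Ruby 1.8 style for an array holding the same string twice; treat them as an assumption rather than captured output. `load_with_links` is a name made up for this example.

```python
def load_with_links(data):
    """Read arrays, strings and '@' links, numbering objects in read order."""
    objects = []                         # index in this list is the object ID
    pos = 2                              # skip the 04 08 version bytes

    def small_int():
        nonlocal pos
        raw = data[pos]
        pos += 1
        return raw - 5 if raw else 0     # single-byte integers only

    def value():
        nonlocal pos
        marker = chr(data[pos])
        pos += 1
        if marker == "[":
            result = []
            objects.append(result)       # registered before its elements
            for _ in range(small_int()):
                result.append(value())
            return result
        if marker == '"':
            length = small_int()
            # bytearray keeps the string mutable, like a Ruby String
            result = bytearray(data[pos:pos + length])
            pos += length
            objects.append(result)
            return result
        if marker == "@":
            return objects[small_int()]  # link to an earlier object
        raise ValueError("unsupported type byte %r" % marker)

    return value(), objects


sample = bytes.fromhex("04 08 5B 07 22 07 61 62 40 06")
result, objects = load_with_links(sample)
print(result, len(objects))
```

The array takes ID 0 and the string takes ID 1, so the link `40 06` points at the string. This sketch hands back the very same object for a link; copy it instead if you want separate strings.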

Offline G_G

Re: Reading Ruby's Marshal Format
« Reply #9 on: February 16, 2014, 10:15:53 AM »
So I was poking around with RMXP again and I was looking at its clipboard data. All its clipboard data consists of is serialized Ruby data. I haven't been able to figure out what the first few bytes of information are, as they seem to be different for everything I copy. Every piece of data is labeled differently. I was able to grab the data in C# and read some of the basic data, just because I had an idea in mind for something. Anyways, I just thought it was kinda cool.

(click to show/hide)

Offline ForeverZer0

  • CP's Pedophile
  • Global Moderator
  • Guardian of Chaos
  • ****
  • Posts: 3255
  • LV: 297
  • Gender: Male
  • Remember you are unique, just like everyone else.
    • View Profile
Re: Reading Ruby's Marshal Format
« Reply #10 on: February 16, 2014, 09:24:38 PM »
I actually have all the clipboard data figured out.  During my absence from the forum, I worked on an RMXP editor in .NET and XNA. It's actually over 90% done, and even has a fully working map editor with autotiles, etc. Anyways, I made sure to allow cutting and pasting objects between the Enterbrain editor and my own, and it's actually not a very hard thing they did, and I would be happy to show you the source for the clipboard data.

My implementation uses IronRuby for the Marshal, but all the data is simply a Ruby Marshal object. The first few bytes are the number of bytes of the object. If I remember correctly, the only exception is copying map data (using the selector tool on the map to copy/cut a section).  It is a multi-dimensional array, using width, height, layer, and tile IDs.
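
A minimal Python sketch of splitting such a clipboard blob (not from the original post). The post only says "the first few bytes", so the 4-byte little-endian length used here is an assumption, and `split_clipboard_payload` is a made-up name.

```python
import struct


def split_clipboard_payload(blob):
    """Return the Marshal bytes that follow an assumed 4-byte length prefix."""
    # ASSUMPTION: the prefix is an unsigned 32-bit little-endian byte count.
    (length,) = struct.unpack_from("<I", blob, 0)
    payload = blob[4:4 + length]
    if len(payload) != length:
        raise ValueError("clipboard data is shorter than its length prefix")
    return payload


# Fake clipboard blob: prefix + a tiny Marshal dump of the integer 5 + padding.
marshal_bytes = bytes.fromhex("04 08 69 0A")
blob = struct.pack("<I", len(marshal_bytes)) + marshal_bytes + b"\x00\x00"
print(split_clipboard_payload(blob))
```

A quick sanity check on real data is that the payload should start with the `04 08` version bytes.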

Here's an example of setting a map event to the clipboard, the "Ruby.MarshalDump" method returns an array of bytes (byte[]):
(click to show/hide)

And here's pasting...
(click to show/hide)

Obviously there are a few methods in there that aren't shown, but you should get the idea. Almost all objects in RMXP use the same way of setting and getting from the clipboard.
I am done scripting for RMXP. I will likely not offer support for even my own scripts anymore, but feel free to ask on the forum, there are plenty of other talented scripters that can help you.

Offline G_G

Re: Reading Ruby's Marshal Format
« Reply #11 on: February 16, 2014, 10:03:11 PM »
Thanks F0! I'd love to check out the source code, but my overall goal of this thread was to create my own Ruby (De)Serializer in C#. My side project I had in mind would be a lot easier with IronRuby but my overall goal is to create the library on my own so developers can easily access RMXP's data in their own code.

Regardless, it'd be cool to check out the code. Thanks bud. :3

Offline ForeverZer0

Re: Reading Ruby's Marshal Format
« Reply #12 on: February 16, 2014, 10:33:01 PM »
From what you say, you could still easily access the data using C# without the need for a custom serializer, just using IronRuby.
In my project, I did use this for a few special instances, specifically the map data. It was a bit cumbersome using a dynamic object, so I created a C# map class that took a dynamic Ruby object in the initializer to create the object.

In each class there is a private field, simply "data" that contains the actual Ruby data, but all the public getters and setters read that data and convert it back and forth between CLR types. For example (not actual code):

(click to show/hide)



Offline Blizzard

Re: Reading Ruby's Marshal Format
« Reply #13 on: February 16, 2014, 11:05:06 PM »
I'm trying to remember whether we ever implemented a Ruby Marshal reader/writer in Python for ARC. If yes, you can use the source code as a guide to make a C# implementation. Technically you could also download Ruby's source code and take a look at the C code, but it's more complicated due to C's low level.

Offline Ryex

  • Arctic Bird of Programming
  • Global Moderator
  • Chaos Ultimate
  • ****
  • Posts: 5135
  • LV: 198
  • Gender: Male
  • Wants to write a compiler for fun
    • View Profile
Re: Reading Ruby's Marshal Format
« Reply #14 on: February 16, 2014, 11:15:33 PM »
We did, but our understanding of how tables were serialized was flawed at the time and it didn't work, so we scrapped it. I've been looking for the code for a long time; we used to have it, but it's not in the ARC source control. It would have been in the rmpy source control, but I went and made sure I purged all of it when I transferred the code to ARC.

Unless you have a copy of it somewhere, it's lost.
I no longer keep up with posts in the forum very well. If you have a question or comment, about my work, or in general I welcome PM's. if you make a post in one of my threads and I don't reply with in a day or two feel free to PM me and point it out to me.

DropBox, the best free file syncing service there is.
(click to show/hide)

Offline Blizzard

Re: Reading Ruby's Marshal Format
« Reply #15 on: February 16, 2014, 11:41:13 PM »
Nope, it's lost then.

Offline Ryex

Re: Reading Ruby's Marshal Format
« Reply #16 on: February 17, 2014, 01:33:11 AM »
NEVER say lost, because hard drives never forget, even if you delete.

Code: (python) [Select]
from RPG import *
from struct import pack, unpack


#============================================================================================
# RubyMarshal
#--------------------------------------------------------------------------------------------
# This class is able to read and write Ruby Marshal format.
#============================================================================================

class RubyMarshal:
   
    MARSHAL_MAJOR   = 4
    MARSHAL_MINOR   = 8
   
    TYPE_NIL        = '0'
    TYPE_TRUE       = 'T'
    TYPE_FALSE      = 'F'
    TYPE_FIXNUM     = 'i'
   
    TYPE_EXTENDED   = 'e' # not implemented
    TYPE_UCLASS     = 'C' # not implemented
    TYPE_OBJECT     = 'o'
    TYPE_DATA       = 'd' # not implemented
    TYPE_USERDEF    = 'u'
    TYPE_USRMARSHAL = 'U' # not implemented
    TYPE_FLOAT      = 'f' # not implemented
    TYPE_BIGNUM     = 'l'
    TYPE_STRING     = '"'
    TYPE_REGEXP     = '/' # not implemented
    TYPE_ARRAY      = '['
    TYPE_HASH       = '{'
    TYPE_HASH_DEF   = '}' # not implemented
    TYPE_STRUCT     = 'S' # not implemented
    TYPE_MODULE_OLD = 'M' # not implemented
    TYPE_CLASS      = 'c' # not implemented
    TYPE_MODULE     = 'm' # not implemented
   
    TYPE_SYMBOL     = ':'
    TYPE_SYMLINK    = ';'
   
    TYPE_IVAR       = 'I' # not implemented
    TYPE_LINK       = '@' # not implemented

    __Version = "\x04\x08"
    __io = None
    __symbols = []
   
    @staticmethod
    def generate(io):
        pass
   
    @staticmethod
    def dump(object, io):
        pass
   
    @staticmethod
    def load(io):
        RubyMarshal.__io = io
        try:
            major = RubyMarshal.__r_byte()
            minor = RubyMarshal.__r_byte()
            if (major != RubyMarshal.MARSHAL_MAJOR or minor != RubyMarshal.MARSHAL_MINOR):
                # String exceptions are not valid; raise a real exception type.
                raise ValueError("incompatible marshal file format (can't be read)\n"
                                 "\tformat version %d.%d required; %d.%d given" %
                                 (RubyMarshal.MARSHAL_MAJOR, RubyMarshal.MARSHAL_MINOR, major, minor))
            obj = RubyMarshal.__r_object()
        except:
            raise
        finally:
            RubyMarshal.__io = None
            RubyMarshal.__symbols = []
        return obj
   
    @staticmethod
    def __r_object():
        objectType = chr(RubyMarshal.__r_byte())
        print "type: " + str(objectType)
       
        if objectType == RubyMarshal.TYPE_LINK:
            index = RubyMarshal.__r_long()
            try:
                return RubyMarshal.__symbols[index]
            except IndexError:
                raise ValueError("dump format error (unlinked %d of %d at 0x%x)" %
                                 (index, len(RubyMarshal.__symbols), RubyMarshal.__io.tell()))
       
        if objectType == RubyMarshal.TYPE_NIL:
            return None
       
        if objectType == RubyMarshal.TYPE_TRUE:
            return True
       
        if objectType == RubyMarshal.TYPE_FALSE:
            return False
       
        if objectType == RubyMarshal.TYPE_FIXNUM:
            return RubyMarshal.__r_long()
       
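        if objectType == RubyMarshal.TYPE_BIGNUM:
            # __r_byte returns an int, so convert it before comparing with '+'.
            sign = (chr(RubyMarshal.__r_byte()) == '+')
            # The length is counted in 16-bit words, so 2 * length bytes follow.
            length = RubyMarshal.__r_long()
            data = RubyMarshal.__r_bytes0(2 * length)
            result = 0
            # The magnitude is stored little endian: lowest byte first.
            for i, byte in enumerate(bytearray(data)):
                result |= byte << (8 * i)
            if not sign:
                result = -result
            RubyMarshal.__r_entry(result)
            return result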
       
        if objectType == RubyMarshal.TYPE_STRING:
            result = RubyMarshal.__r_bytes()
            RubyMarshal.__r_entry(result)
            return result
       
        if objectType == RubyMarshal.TYPE_ARRAY:
            result = RubyMarshal.__r_array()
            RubyMarshal.__r_entry(result)
            return result
       
        if objectType == RubyMarshal.TYPE_HASH:
            result = RubyMarshal.__r_hash()
            RubyMarshal.__r_entry(result)
            return result
       
        if objectType == RubyMarshal.TYPE_USERDEF:
            #try:
            result = RubyMarshal.__r_unique()
            result._load(RubyMarshal.__io)
            #RubyMarshal.__r_entry(result)
            return result
            #except:
            #    raise "class %s needs to have method '_load'" % klass
            #pass
       
        if objectType == RubyMarshal.TYPE_OBJECT:
            print "__symbols: " + str(RubyMarshal.__symbols)
            result = RubyMarshal.__r_unique()
            print "result: " + str(result)
            length = RubyMarshal.__r_long()
            attributes = {}
            while (length > 0):
                print "get key"
                key = RubyMarshal.__r_symbol()
                print "get value"
                value = RubyMarshal.__r_object()
                print "key, value: " + str(key) + " " + str(value)
                attributes[key] = value
                length -= 1
            print str(attributes)
            for symbol in attributes.keys():
                setattr(result, symbol.replace("@", ""), attributes[symbol])
            RubyMarshal.__r_entry(result)
            return result
       
        if objectType == RubyMarshal.TYPE_SYMBOL:
            # __r_symreal already records the new symbol, so don't add it twice.
            return RubyMarshal.__r_symreal()

        if objectType == RubyMarshal.TYPE_SYMLINK:
            # A symlink only points at an existing entry; it creates no new one.
            return RubyMarshal.__r_symlink()
       
        raise ValueError("dump format error(0x%x at 0x%x)" % (ord(objectType), RubyMarshal.__io.tell()))
   
    @staticmethod
    def __r_byte():
        return ord(RubyMarshal.__io.read(1))
   
    @staticmethod
    def __r_bytes():
        return RubyMarshal.__r_bytes0(RubyMarshal.__r_long())
   
    @staticmethod
    def __r_bytes0(length):
        if (length == 0):
            return ''
        return RubyMarshal.__io.read(length)
   
    @staticmethod
    def __r_array():
        length = RubyMarshal.__r_long()
        result = []
        while (length > 0):
            result.append(RubyMarshal.__r_object())
            length -= 1
        return result
   
    @staticmethod
    def __r_hash():
        length = RubyMarshal.__r_long()
        result = {}
        while (length > 0):
            key = RubyMarshal.__r_object()
            value = RubyMarshal.__r_object()
            try:
                result[key] = value
            except TypeError:
                result[tuple(key)] = value
            length -= 1
        return result
   
    @staticmethod
    def __r_long():
        c = RubyMarshal.__r_byte()
        if c > 127:
            c -= 256
        if (c == 0):
            return 0
        if (c > 0):
            if (4 < c and c < 128):
                return (c - 5)
            result = 0
            for i in xrange(c):
                result |= RubyMarshal.__r_byte() << (8 * i)
            return result
        if (-129 < c and c < -4):
            return (c + 5)
        c = -c
        result = -1
        for i in xrange(c):
            result &= ~(0xFF << (8 * i))
            result |= RubyMarshal.__r_byte() << (8 * i)
        return result
   
    @staticmethod
    def __r_symreal():
        symbol = RubyMarshal.__r_bytes()
        print "symreal: " + str(symbol)
        RubyMarshal.__r_entry(symbol)
        return symbol
   
    @staticmethod
    def __r_symlink():
        index = RubyMarshal.__r_long()
        if index >= len(RubyMarshal.__symbols):
            raise ValueError("bad symbol (0x%x)" % RubyMarshal.__io.tell())
        print "symlink: " + str(index) + " " + str(RubyMarshal.__symbols[index])
        return RubyMarshal.__symbols[index]
   
    @staticmethod
    def __r_unique():
        return RubyMarshal.__id2name(RubyMarshal.__r_symbol())
   
    @staticmethod
    def __r_symbol():
        if chr(RubyMarshal.__r_byte()) == RubyMarshal.TYPE_SYMLINK:
            return RubyMarshal.__r_symlink()
        return RubyMarshal.__r_symreal()
   
    @staticmethod
    def __r_entry(value):
        RubyMarshal.__symbols.append(value)
        return value
   
    @staticmethod
    def __id2name(name):
        print "idtoname: " + str(name)
        return eval(name.replace("::", ".") + "()")

   
       



Offline G_G

Re: Reading Ruby's Marshal Format
« Reply #17 on: February 17, 2014, 01:52:26 AM »
Mmmmm. This will actually help quite a bit. Thanks for the help guys!

Offline Ryex

Re: Reading Ruby's Marshal Format
« Reply #18 on: February 17, 2014, 01:54:28 AM »
Just keep in mind that this was broken when we abandoned it. It worked for some files, but if a file had a table in it, it crashed.

EDIT:
Also, holy crap: when I remembered I had kept RMPY on my external drive before I did a format of it, I went and ran Recuva on it to see what I could find. I quickly found our old Ruby Marshal in Python from back in December 2010. But there was a lot of shit hidden on that drive that I've been missing, like my old FL song projects and my work on PNO. As in old PNO.
« Last Edit: February 17, 2014, 01:58:11 AM by Ryex »

Offline G_G

Re: Reading Ruby's Marshal Format
« Reply #19 on: February 17, 2014, 03:58:38 AM »
lol nice. And I'm sure the table can be read somehow. Don't we just have to unpack the data and then read it as if it were a ruby object? Here's vgvgf's Table rewrite that I'd always use when loading RMXP data with Ruby or IronRuby.

Code: [Select]
class Table
  def initialize(x, y = 1, z = 1)
     @xsize, @ysize, @zsize = x, y, z
     @data = Array.new(x * y * z, 0)
  end
  def [](x, y = 0, z = 0)
     @data[x + y * @xsize + z * @xsize * @ysize]
  end
  def []=(*args)
     x = args[0]
     y = args.size > 2 ? args[1] : 0
     z = args.size > 3 ? args[2] : 0
     v = args.pop
     @data[x + y * @xsize + z * @xsize * @ysize] = v
  end
  def _dump(d = 0)
     s = [3].pack('L')
     s += [@xsize].pack('L') + [@ysize].pack('L') + [@zsize].pack('L')
     s += [@xsize * @ysize * @zsize].pack('L')
     for z in 0...@zsize
        for y in 0...@ysize
           for x in 0...@xsize
              s += [@data[x + y * @xsize + z * @xsize * @ysize]].pack('S')
           end
        end
     end
     s
  end
  def self._load(s)
     size = s[0, 4].unpack('L')[0]
     nx = s[4, 4].unpack('L')[0]
     ny = s[8, 4].unpack('L')[0]
     nz = s[12, 4].unpack('L')[0]
     data = []
     pointer = 20
     loop do
        data.push(*s[pointer, 2].unpack('S'))
        pointer += 2
        break if pointer > s.size - 1
     end
     t = Table.new(nx, ny, nz)
     n = 0
     for z in 0...nz
        for y in 0...ny
           for x in 0...nx
              t[x, y, z] = data[n]
              n += 1
           end
        end
     end
     t
  end
  attr_reader(:xsize, :ysize, :zsize, :data)
end

And as far as I can tell, it still reads the data just fine.
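
For a reader written outside Ruby, the layout implied by `_dump` above can be unpacked like this. It's a rough Python sketch (not from the original post) that assumes the bytes are little endian, as they would be on a typical x86 machine, since Ruby's 'L' and 'S' use native byte order. `parse_table` and `cell` are names made up for this example.

```python
import struct


def parse_table(payload):
    """Unpack the byte string that Table._dump produces."""
    # Header: five unsigned 32-bit values. The first one is written as 3 by
    # _dump above; the last one is the number of cells.
    dims, xsize, ysize, zsize, count = struct.unpack_from("<5I", payload, 0)
    # Body: one unsigned 16-bit value per cell, starting at byte 20.
    cells = list(struct.unpack_from("<%dH" % count, payload, 20))
    return {"dims": dims, "xsize": xsize, "ysize": ysize,
            "zsize": zsize, "data": cells}


def cell(table, x, y=0, z=0):
    # Same index formula as Table#[] in the Ruby code.
    xsize, ysize = table["xsize"], table["ysize"]
    return table["data"][x + y * xsize + z * xsize * ysize]


# A 2x2x1 table holding the values 10..13, built the way _dump would.
payload = struct.pack("<5I", 3, 2, 2, 1, 4) + struct.pack("<4H", 10, 11, 12, 13)
table = parse_table(payload)
print(cell(table, 1, 1))
```

In a Marshal file this byte string would sit inside a user-defined ('u') record after the class name, so a reader needs to pull the length-prefixed bytes out first and then hand them to something like `parse_table`.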