Is it possible to serialize an ActiveRecord object?

Daniel Baez

Dec 3, 2019, 2:05:47 PM
to Ruby on Rails: Talk
Hello,

Let's say I have a model class like `Car < ActiveRecord::Base`, and I want to put an abstract class in between, so the chain becomes `Car < CacheableActiveRecord::Base < ActiveRecord::Base`.

In `CacheableActiveRecord::Base` I'll override methods like `find_by`, `find_by!` and `find` so that they check Redis/Memcached before calling `super` and letting Rails resolve the query against the underlying DB. I'm using Rails 4.2.7.1.

Any class that extends it, such as `Car`, gets an `after_commit` hook for each of its instances. That hook clears the cached key for the record on commit, so the cache entry is released on save.
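
The overall shape described so far (a class in between that checks the cache in `find`, falls through to `super` on a miss, and clears the entry after a commit) can be sketched in plain Ruby. Everything here is a stand-in invented for illustration: `FakeRecord` plays the role of `ActiveRecord::Base`, the `CACHE` hash plays the role of Redis/Memcached, and `after_commit` is an ordinary method rather than the Rails callback.

```ruby
# Plain-Ruby sketch: no Rails, Redis, or database involved.
CACHE = {} # stands in for Redis/Memcached

# Stands in for ActiveRecord::Base: a class-level `find` plus a `save`
# that writes to an in-memory table and then fires `after_commit`.
class FakeRecord
  attr_reader :id
  attr_accessor :model

  def self.table
    @table ||= {}
  end

  def self.db_reads
    @db_reads ||= 0
  end

  def self.find(id)
    @db_reads = db_reads + 1
    new(**table.fetch(id))
  end

  def initialize(id:, model:)
    @id = id
    @model = model
  end

  def save
    self.class.table[id] = { id: id, model: model }
    after_commit
    self
  end

  def after_commit; end
end

# The layer in between: check the cache before falling through to `super`.
class CacheableRecord < FakeRecord
  def self.cache_key(id)
    "#{name.downcase}:#{id}"
  end

  def self.find(id)
    key = cache_key(id)
    CACHE[key] ||= Marshal.dump(super) # miss: read the "database" once
    Marshal.load(CACHE[key])
  end

  # Drop the stale entry once a write has gone through.
  def after_commit
    CACHE.delete(self.class.cache_key(id))
  end
end

class Car < CacheableRecord; end

car = Car.new(id: 1, model: "Civic").save
2.times { Car.find(1) }
puts Car.db_reads      # 1: the second find was served from CACHE
car.model = "Accord"
car.save               # after_commit clears "car:1"
puts Car.find(1).model # Accord
```

In a real Rails model the invalidation would hang off the `after_commit` callback instead of being called from `save` directly, and the key scheme would need to be agreed on by every process sharing the cache.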

This is my implementation of `find` in `CacheableActiveRecord::Base`:

```
    def self.find(*ids)
      expects_array = ids.first.kind_of?(Array)
      return ids.first if expects_array && ids.first.empty?

      ids = ids.flatten.compact.uniq

      case ids.size
      when 0
        super # no ids left: calling super raises an exception at this point
      when 1
        id = ids.first

        result = get_from_cache(id)
        unless result
          result = super # cache miss: I expect this to get the data from the DB
          result = set_to_cache(id, result) if result
        end

        expects_array ? [result] : result
      else
        super # for now, skip the cache when several ids are given
      end
    rescue RangeError
      # `@klass` only exists on a Relation; at class level `name` is the model name.
      # The exception is qualified because it lives in the ActiveRecord namespace.
      raise ActiveRecord::RecordNotFound, "Couldn't find #{name} with an out of range ID"
    end
```

I have tests for this, and they even seem to work. For `set_to_cache` and `get_from_cache` I have the following code:

```
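    # Returns the unmarshalled record for entity_id, or nil on a cache miss.
    def self.get_from_cache(entity_id)
      return nil unless entity_id

      cached = Rc.get!(get_cache_key(entity_id))
      puts "cache hit" if cached

      cached ? Marshal.load(cached) : nil
    end

    # Stores a marshalled copy of entity and returns the round-tripped object,
    # i.e. the same kind of object a later cache hit would produce.
    def self.set_to_cache(entity_id, entity)
      return nil unless entity && entity_id

      dump = Marshal.dump(entity)
      Rc.set!(get_cache_key(entity_id), dump)

      Marshal.load(dump)
    end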
```

My questions are:

  • Is `Marshal` safe? Can one do this, given that the cache will be shared among instances of the same Rails app running on different servers?
  • Is it better to serialize to JSON? Is it possible to serialize to JSON and then rebuild an object that works as a regular ActiveRecord object, i.e. one on which you can call `.where` queries on related objects and so on?
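
To make the two options concrete, here is a small plain-Ruby comparison. `CarRow` is a hypothetical stand-in for a model row, not an ActiveRecord object, so this only illustrates what each format preserves. One known caveat with `Marshal`: `Marshal.load` should only ever be given data you trust, and every process that reads the blob needs a compatible class definition.

```ruby
require "json"

# Hypothetical stand-in for a model row; a real ActiveRecord object carries
# much more internal state than this.
CarRow = Struct.new(:id, :model, :status)
car = CarRow.new(1, "Civic", :sold)

# Marshal: Ruby-only binary format. The class and Ruby types come back intact.
blob = Marshal.dump(car)
from_marshal = Marshal.load(blob)
puts from_marshal == car         # true
puts from_marshal.status.inspect # :sold

# JSON: readable from any language, but only plain attributes survive.
json = JSON.generate(car.to_h)
attrs = JSON.parse(json)
puts attrs.class                 # Hash
puts attrs["status"].inspect     # "sold" (a String now, not a Symbol)

# Getting an object back means rebuilding it from the attributes by hand.
rebuilt = CarRow.new(attrs["id"], attrs["model"], attrs["status"].to_sym)
puts rebuilt == car              # true
```

The JSON path hands back a bare Hash, so turning it into something that behaves like a persisted record (associations, `.where` on related objects) is extra work that this sketch does not attempt.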

I'm a newbie to Rails and Ruby; I started coding with both two weeks ago. My background is mostly Java xD (finding Ruby great, by the way, hahaha).

Thank you

Daniel


Adam Lassek

Dec 3, 2019, 3:27:26 PM
to Ruby on Rails: Talk
It might be *possible* but I would strongly suggest rethinking this.

What is going wrong with your DB that its built-in caching is insufficient?
Have you exhausted your performance-tuning options in the DB layer?
Have you measured a significant performance problem in the first place?
Do you *really* need to serialize the entire record or could you cache just what you need?
Have you considered the difficulty of keeping two data stores in sync?

If you can confidently answer all of those questions and you still want to pursue this design, then I would begin again with a data-mapper pattern instead.

Daniel Baez

Dec 3, 2019, 5:55:24 PM
to Ruby on Rails: Talk
Hi Adam, thanks for answering

About the questions:
    • What is going wrong with your DB that its built-in caching is insufficient?
      • Load. There are millions of queries per second now, and the Active Model classes I want to update are read constantly but hardly ever written.
    • Have you exhausted your performance-tuning options in the DB layer?
      • I'm not an expert on this, but I understand that, yes, performance tuning only got us so far.
    • Have you measured a significant performance problem in the first place?
      • This investigation is looking to the future: I'll need to serve 20 times more traffic.
    • Do you *really* need to serialize the entire record or could you cache just what you need?
      • I'm actually writing services in another language that could fetch this data from the same cache; reads are all concentrated in my Rails app.
      • So I'm actually looking to serialize a "row" to Thrift/Avro/Proto, save it somewhere (Redis/Memcached), and then read it from there instead of issuing queries for data that doesn't change.
      • I'm after two goals:
        • Improve the performance of my Rails monolith
        • Make this data available to other services
    • Have you considered the difficulty of keeping two data stores in sync?
      • I've done this in the past: the exact same idea, but on a custom MVC without an ORM, and it worked perfectly and improved performance. Later on, though, the system reached a point where cache invalidations became problematic: it could no longer operate unless its cache was warm enough.

Besides these questions, given that I do want to pursue a design like this:

- Can I plug this `data-mapper` in anywhere in Rails? Can I override a few methods and then delegate the rest back to the DB for a particular model?
- I got the idea of using `Marshal` from browsing the `Rails.cache` code, where functions to serialize/deserialize can be defined, or they default to `Marshal.load` and `Marshal.dump`. The `Rails.cache` documentation doesn't say anything against doing something like:

```
class CarCache
  # Class-level method, so call sites can use CarCache.get_car_from_cache_or_db(id)
  def self.get_car_from_cache_or_db(id)
    Rails.cache.fetch("cars.#{id}") do
      Car.find(id)
    end
  end
end
```

And then changing my application's call sites from `Car.find` to `CarCache.get_car_from_cache_or_db`.

Right?
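
For readers who have not used the fetch-with-block pattern, the sketch below mimics its behaviour in plain Ruby: the block runs only on a miss, and an `expires_in` value bounds how long a stale entry can live if an invalidation is ever missed. `TinyCache` is a hypothetical class written for this example; it is not `Rails.cache`, and the injectable clock exists only so expiry can be shown without sleeping.

```ruby
# Hypothetical in-memory cache that mimics the fetch-with-block shape.
class TinyCache
  Entry = Struct.new(:value, :expires_at)

  def initialize(clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @clock = clock
    @entries = {}
  end

  # Returns the live cached value for key; otherwise runs the block,
  # stores its result for expires_in seconds, and returns it.
  def fetch(key, expires_in:)
    now = @clock.call
    entry = @entries[key]
    return entry.value if entry && entry.expires_at > now

    value = yield
    @entries[key] = Entry.new(value, now + expires_in)
    value
  end
end

now = 0.0
cache = TinyCache.new(clock: -> { now })
db_reads = 0
find_car = ->(id) { db_reads += 1; { id: id, model: "Civic" } } # stands in for Car.find

2.times { cache.fetch("cars.1", expires_in: 60) { find_car.call(1) } }
puts db_reads # 1: the second call never ran the block

now = 61.0 # pretend 61 seconds have passed
cache.fetch("cars.1", expires_in: 60) { find_car.call(1) }
puts db_reads # 2: the entry expired, so the block ran again
```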

San Ji

Dec 3, 2019, 7:37:31 PM
to Ruby on Rails: Talk
Most of the time, as Adam suggested, the database will already do this for you.

But if that doesn't work, then your table is probably just so large that it won't fit in memory (I guess).

If the situation allows, try adding more memory to the database instances.
If the size of your data is not in the same ballpark as a server's memory, congratulations, your business is doing great.
In that case, I would try sharding and using a KV store instead of a relational database.
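
As a rough illustration of what sharding a KV store involves, each key has to map to the same shard from every process. The sketch below uses a stable hash modulo the shard count; the shard names are hypothetical, and a production setup would more likely rely on the client library's own partitioning or on consistent hashing, since plain modulo moves most keys whenever the shard count changes.

```ruby
require "zlib"

# Hypothetical shard names; in practice these would be connection settings.
SHARDS = %w[kv-0 kv-1 kv-2 kv-3].freeze

# Map a key to a shard with a stable hash. Zlib.crc32 is used rather than
# String#hash because String#hash is seeded differently in each process.
def shard_for(key)
  SHARDS[Zlib.crc32(key) % SHARDS.size]
end

puts shard_for("cars.1")
puts shard_for("cars.1") == shard_for("cars.1") # true: the mapping is deterministic
```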

I have never faced something at that scale; I guess it also depends on the application.

There's not much more I can add here, but you can try searching for articles about Twitter. They had the same problem in the past and were open about it, so you should find their material easily.

Jim

Dec 5, 2019, 8:28:29 AM
to Ruby on Rails: Talk
On Tuesday, December 3, 2019 at 5:55:24 PM UTC-5, Daniel Baez wrote:
• What is going wrong with your DB that its built-in caching is insufficient?
  • Load, there are millions of queries running a sec now, and the Active Model classes I want to update are read constantly but hardly ever written

Have you tested to be sure that Marshal.load is significantly faster than ActiveRecord instantiation from a query? Have you verified that the database queries are a significant portion of the time to handle the request?
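
One way to start answering the first question is a small timing harness like the one below. It is only a sketch: `CachedCar` is a hypothetical stand-in, and the comparison here is against building a plain object, not against ActiveRecord. In a Rails console the second block would be swapped for the real query, for example `Car.find(id)`.

```ruby
# Hypothetical stand-in for a cached record.
CachedCar = Struct.new(:id, :model, :year)

# Wall-clock seconds spent running the block `iterations` times.
def time_it(iterations)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  iterations.times { yield }
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

blob = Marshal.dump(CachedCar.new(1, "Civic", 2019))
n = 100_000

load_s  = time_it(n) { Marshal.load(blob) }
build_s = time_it(n) { CachedCar.new(1, "Civic", 2019) }

puts format("Marshal.load: %.3f us/op", load_s / n * 1_000_000)
puts format("plain new:    %.3f us/op", build_s / n * 1_000_000)
```

The absolute numbers depend on the machine and Ruby version, so the useful output is the ratio between the two code paths measured in the same run.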

Have you tried using Rails view caching? View caching also saves you the view rendering time, which is often the majority of the time spent handling a request. For example, here's a typical page load from one of my apps: Completed 200 OK in 35ms (Views: 25.6ms | ActiveRecord: 4.3ms). At millions of requests per second, I'd gain almost nothing by reducing the ActiveRecord time from 4.3ms to 1ms, compared to using the various elements of view caching that are already well supported and documented.
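
Worked through with the figures from that log line, the arithmetic looks like this. The only inputs are the numbers quoted above; `pct` is a helper defined just for this example.

```ruby
total_ms  = 35.0 # whole request
views_ms  = 25.6 # view rendering
ar_ms     = 4.3  # ActiveRecord today
ar_target = 1.0  # ActiveRecord time if record caching worked as hoped

# Share of `whole` represented by `part`, as a percentage with one decimal.
def pct(part, whole)
  (part / whole * 100).round(1)
end

puts pct(ar_ms - ar_target, total_ms) # 9.4  (share of the request saved by caching records)
puts pct(views_ms, total_ms)          # 73.1 (share of the request spent rendering views)
```

So for this particular request, caching records can save at most about a tenth of the response time, while view rendering accounts for almost three quarters of it.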

