Better debugging with Ruleby

71 views
Skip to first unread message

Andrew Bills

unread,
Sep 26, 2012, 6:01:19 AM9/26/12
to rul...@googlegroups.com
Anyone got any tips and tricks outside of just adding write to logs into ruleby to get more direct errors for debugging? Ive been using ruleby for a pet project and its pretty good although Ive started adding code to clean up old events as well as trying to find tune it for speed etc etc.... and im not sure if its a data error or something ive done but inside the rules engine when I run it with my scheduler now I randomly get:

2012-09-26 09:29:21 UTC - job Rufus::Scheduler::EveryJob_14454640 caught exception 'undefined method `+' for nil:NilClass'

I find the section of code causing it (i think), remove it, doesn't happen anymore. I put the same code back, isnt happening anymore. I leave it for a while, comes back on every pass. I try and run it without the scheduler and it doesnt happen. Driving me slightly nuts.

Joe Kutner

unread,
Sep 26, 2012, 11:54:48 AM9/26/12
to rul...@googlegroups.com
The ruleby internals have a lot of weird indirection that make stuff like this difficult to diagnose.  Can we see the chuck of code that is causing it (which you said you removed and added back in)?

--
You received this message because you are subscribed to the Google Groups "Ruleby" group.
To view this discussion on the web visit https://groups.google.com/d/msg/ruleby/-/9dnduBJYeooJ.
To post to this group, send email to rul...@googlegroups.com.
To unsubscribe from this group, send email to ruleby+un...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/ruleby?hl=en.

Andrew Bills

unread,
Sep 27, 2012, 3:34:22 AM9/27/12
to rul...@googlegroups.com
Surely can...


I have eliminated the rufus scheduler by adding the same lines over and over again a dozen or so times so its effectively running manually. So its definitely the rules engine part.... this might be a bit complex but let me try:

So in the initialise I created everything:
require 'rufus/scheduler'
require 'RulesEngineCaller'

debug = true
scheduler = Rufus::Scheduler.start_new(:frequency => 1.0)
Rails.logger = Logger.new("rule-log.txt", 'daily')

Rails.logger.debug "#{Time.now.utc} - Created Notification Object"
MSG_NOTIFY = Notification.new()

rule_engine = Ruleby::Core::Engine.new
Rails.logger.debug "#{Time.now.utc} - Created New Rules Engine"
RULES_ENG = RulesEngineCaller.new(rule_engine)
unless ( File.basename($0) == "rake" && ARGV.include?("db:migrate") )
  EngineRulebook.new(rule_engine).rules

  if debug == true
    sleep 30
    RULES_ENG.rebuild_engine
    RULES_ENG.run_engine
  else
    #initial data load in case no rules are changed at start
    scheduler.in('30s', :allow_overlapping => false) do
      Rails.logger.debug "#{Time.now.utc} - Added Rules & Facts to New Rules Engine"
      ActiveRecord::Base.connection_pool.with_connection do
        RULES_ENG.rebuild_engine
      end
    end

    #run re-occuring per schedule in config
    scheduler.every(CONFIG[:core_settings][:run_engine_time], :allow_overlapping => false) do
      ActiveRecord::Base.connection_pool.with_connection do
        RULES_ENG.run_engine
      end
    end
    scheduler.every(CONFIG[:core_settings][:forced_engine_restart], :allow_overlapping => false) do
      ActiveRecord::Base.connection_pool.with_connection do
        RULES_ENG.rebuild_engine
      end
    end
  end
end

def scheduler.handle_exception(job, exception)
  Rails.logger.debug "#{Time.now.utc} - job #{job.job_id} caught exception '#{exception}'"
end

Then I have my rules engine caller where I build the engine and do the running:

def rebuild_engine
    Rails.logger.debug "#{Time.now.utc} - Acquiring Mutex"
    @rules_mutex.synchronize {
      Rails.logger.debug "#{Time.now.utc} - Mutex Acquired"
      #get rid of all terminated events on a reload
      @terminated_event = Event.find_all_by_terminate_flag('1')
      @terminated_event.each do |ended_ticket|
        Event.where(:ticket_id => ended_ticket.ticket_id, :source => ended_ticket.source).destroy_all
      end

      #get rid of all 'records' older than the retention date
      @old_record = Record.where("updated_at < ?", CONFIG[:core_settings][:event_history_length].to_i.days.ago)
      @old_record.each do |old_ticket|
        Record.destroy(old_ticket)
      end

      #get rid of all events that arent used by any rules
      #@invalid_source_events = Event.where(:source != Rule.select(:source).uniq.source)
      #@invalid_source_events.each do |invalid_source_ticket|
        #Rails.logger.info "No Rule Source Match, Deleting Record: ticket_id - #{invalid_source_ticket.ticket_id} | source - #{invalid_source_ticket.source} | cust region - #{invalid_source_ticket.cust_region} | description - #{invalid_source_ticket.description} | created at - #{invalid_source_ticket.created_at}"
        #Event.destroy(invalid_source_ticket)
      #end

      @rule_engine = Ruleby::Core::Engine.new
      EngineRulebook.new(@rule_engine).rules

      @events = Event.all
      Rails.logger.debug "#{Time.now.utc} - Started loading Events"
      @events.each do |event|
        @rule_engine.assert event
      end
      Rails.logger.debug "#{Time.now.utc} - Finished loading Events"
      Rails.logger.debug "#{Time.now.utc} - Releasing Mutex"
    }
    Rails.logger.debug "#{Time.now.utc} - Mutex Released"
  end

  def run_engine
    Rails.logger.debug "#{Time.now.utc} - Run Engine"
    d = Time.now.utc
    puts "#{Time.now.utc} - Run Engine"
    Rails.logger.debug "#{Time.now.utc} - Acquiring Mutex"
    @rules_mutex.synchronize {
      Rails.logger.debug "#{Time.now.utc} - Mutex Acquired"
      @rule_engine.match

      #hard coded alert condition if running engine takes too long
      if (Time.now.utc - d) > CONFIG[:core_settings][:threshold].to_i
        @fail_count = @fail_count + 1
        Rails.logger.debug "#{Time.now.utc} - Processing above set threshold #{Time.now.utc - d} seconds, fail count #{@fail_count}"
        if @fail_count == 5
          MSG_NOTIFY.notify_group(1, "Alert Processing above set threshold x5, last processing #{Time.now.utc - d} seconds")
        elsif @fail_count == 30
          MSG_NOTIFY.notify_group(1, "Alert Processing above set threshold x30, last processing #{Time.now.utc - d} seconds")
        elsif @fail_count == 60
          MSG_NOTIFY.notify_group(1, "Alert Processing above set threshold x60, last processing #{Time.now.utc - d} seconds")
          @fail_count = 0
        end
      else
        @fail_count = 0
      end
      Rails.logger.debug "#{Time.now.utc} - Releasing Mutex"
      }
    Rails.logger.debug "#{Time.now.utc} - Mutex Released"
    Rails.logger.debug "#{Time.now.utc} - Finished Engine Match #{Time.now.utc - d} "
    @last_processing_timestamp = Time.now.utc
    @last_processing_duration = Time.now.utc - d

    #add values to an array for use with tracking history
    if @last_processing_duration_history.length > 20
      @last_processing_duration_history.shift
      @last_processing_duration_history << @last_processing_duration
    else
      @last_processing_duration_history << @last_processing_duration
    end
  end

Ok so now you have the bits that explain (wtf) im doing in some of my code here is the rules engine part where I load in the rules programatically....

require_relative 'notification'
require 'ruleby'
include Ruleby

class EngineRulebook < Rulebook
  def rules
    Rails.logger.debug "#{Time.now.utc} - Loading Rules into Rulebook"
    #msg = Notification.new()
    @rules = Rule.all

    @rules.each do |rule_name|
      case
        ##### Rules for no other text & no ctc #####
        # milestone 1 =, other text na, ctc na, denomination not used
        when rule_name.milestone1_operator == "=" && rule_name.other_text_operator == "" && rule_name.ctc_id_operator == "" && rule_name.milestone1_time_value_denomination == ""
          rule [Event, :m, m.milestone == rule_name.milestone1_value, m.source =~ /#{rule_name.source}|^$/, m.cust_no =~ /#{rule_name.cust_no}|^$/, m.call_type =~ /#{rule_name.call_type}|^$/, m.priority =~ /#{rule_name.priority}|^$/, m.group_owner =~ /#{rule_name.group_owner}|^$/, m.entitlement_code =~ /#{rule_name.entitlement_code}|^$/ ] do |v|
            #check if rule has already fired
            if v[:m].rules.include?(rule_name) == false
              #output matched rule to console & logfile
              Rails.logger.info "#{Time.now.utc} - match rule #{rule_name.id} #{rule_name.title} - #{v[:m].ticket_id} - #{v[:m].description}"

              #do actual alert
              MSG_NOTIFY.notify_user(rule_name.id, v[:m].id)

              #add reference so rule doesn't fire again
              @event = Event.find_by_id(v[:m].id)
              @event.rules << rule_name
              v[:m].rules << rule_name
              modify v[:m]
            end
            #if updating a bunch of records last updated, then do it here
          end
        #milestone1 =, no other text, no ctc, duration based ###### THE SECTION BELOW IS THE PROBLEM#######
        when rule_name.milestone1_operator == "=" && rule_name.other_text_operator == "" && rule_name.ctc_id_operator == "" && (rule_name.milestone1_time_value_denomination == "H" || rule_name.milestone1_time_value_denomination == "D")
          rule [Event, :m, m.milestone == rule_name.milestone1_value, m.milestone_type =~ /^D|^S/,m.source =~ /#{rule_name.source}|^$/, m.cust_no =~ /#{rule_name.cust_no}|^$/, m.call_type =~ /#{rule_name.call_type}|^$/, m.priority =~ /#{rule_name.priority}|^$/, m.group_owner =~ /#{rule_name.group_owner}|^$/, m.entitlement_code =~ /#{rule_name.entitlement_code}|^$/ ] do |v|
            #check if rule has already fired
            if v[:m].rules.include?(rule_name) == false
              #find the last event
              @last_event = Event.where(:ticket_id => v[:m].ticket_id, :source => v[:m].source, :milestone_type => ['D', 'S']).order(:time_stamp).select(:id).last
              #check current event against the last duration type event
              if v[:m].id == @last_event.id
                if rule_name.milestone1_time_value_denomination == "H"
                  #convert hours to minutes
                  target_minutes = (rule_name.milestone1_time_value * 60).to_int
                  milestone1_target_time = (v[:m].time_stamp + target_minutes.minutes)
                elsif rule_name.milestone1_time_value_denomination == "D"
                  #convert days to minutes
                  target_minutes = (rule_name.milestone1_time_value * 60 * 24).to_int
                  milestone1_target_time = (v[:m].time_stamp + target_minutes.minutes)
                else
                  #erroneous data error, do not trigger but record in log
                  milestone1_target_time = Time.now.utc - 5.years
                  Rails.logger.warn "#{Time.now.utc} - DATA ERROR in #{rule_name.id} #{rule_name.title} - #{v[:m].ticket_id} - #{v[:m].description}"
                end

                #check time now against the target specified
                if Time.now.utc >= milestone1_target_time
                  #matches condition
                  #output matched rule to console & logfile
                  Rails.logger.info "#{Time.now.utc} - match rule #{rule_name.id} #{rule_name.title} - #{v[:m].ticket_id} - #{v[:m].description}"

                  #do actual alert
                  MSG_NOTIFY.notify_user(rule_name.id, v[:m].id)

                  #add reference so rule doesn't fire again
                  @event = Event.find_by_id(v[:m].id)
                  @event.rules << rule_name
                  v[:m].rules << rule_name
                  modify v[:m]
                else
                  #do nothing, time hasn't expired yet - allows for check on next cycle
                end
              else
                #not the last event for duration calculation
                #add reference so rule doesn't fire again
                @event = Event.find_by_id(v[:m].id)
                @event.rules << rule_name
                v[:m].rules << rule_name
                modify v[:m]
              end
            end
          end


So I build rules in a form, then i use a big case statement to load them in programatically. I try and make the rules engine do most of the heavy lifting but I have to draw the line somewhere so some extra checking is done to see if everything is true after the rules engine says its right. Then I add some values towards the end to allow the fact not to be re-run on the same rule but allow the fact to match other rules which it might use. I'm thinking I'm going to add some sort of timestamp value to facts that are used for assessment so i can delete facts that aren't used and maybe speed it up some more. Currently I'm holding about 150 rules and using 56000 facts from a live feed. This is super cool stuff and it was working awesomely until I did something (im assuming) I did discard some records which aren't referred to in any of the above code as part of a a scheduled cleanup (Records) is something I keep to tell me who recieved what notifications and had what matches per users who want to know particular rules.

Its a little complicated... if the ruleby gave me more debugging info I could figure it out, currently without the rufus scheduler the error doesnt come up but it definitely resets and tries to start again so its in sort of a loop. Pulling out the bit I mention makes it work but then I dont have a section I need.

Andrew Bills

unread,
Sep 28, 2012, 6:18:44 AM9/28/12
to rul...@googlegroups.com
I also changed 


rule [Event, :m, m.milestone == rule_name.milestone1_value, m.milestone_type =~ /^D|^S/,m.source =~ /#{rule_name.source}|^$/, m.cust_no =~ /#{rule_name.cust_no}|^$/, m.call_type =~ /#{rule_name.call_type}|^$/, m.priority =~ /#{rule_name.priority}|^$/, m.group_owner =~ /#{rule_name.group_owner}|^$/, m.entitlement_code =~ /#{rule_name.entitlement_code}|^$/ ] do |v|

to

rule [Event, :m, m.milestone == rule_name.milestone1_value ] do |v|

Im certain it doesnt like a particular fact, if I knew which fact it was I could probably work backwards to figure out the problem.

Matt Smith

unread,
Sep 28, 2012, 8:15:28 AM9/28/12
to rul...@googlegroups.com
Hi Andrew,

The original exception:  caught exception 'undefined method `+' for nil:NilClass'

would suggest that a piece of the data is nil.  From the code you have shown the only possible causes for this error would be:
  @fail_count = @fail_count + 1 
or
  v[:m].time_stamp + target_minutes.minutes

I would check the initialization of @fail_count and check Event.time_stamp values for nil.

Might not be the problem, but I would look there first.

Thanks,

Matt
--
You received this message because you are subscribed to the Google Groups "Ruleby" group.
To view this discussion on the web visit https://groups.google.com/d/msg/ruleby/-/z2laa2EcwREJ.

Joe Kutner

unread,
Sep 28, 2012, 9:54:17 AM9/28/12
to rul...@googlegroups.com
its also possible that v[:m] is nil.  If that's the case, then it may be a Ruleby bug.  But I don't see how that could happen because there is a if v[:m].id == @last_event.id guard around that statement.

Andrew Bills

unread,
Oct 1, 2012, 9:57:16 PM10/1/12
to rul...@googlegroups.com
More playing around told me its literally this line (as a #do nothing with it has the error, removing it does not have the error).

          rule [Event, :m, m.milestone == rule_name.milestone1_value, m.milestone_type =~ /^D|^S/,m.source =~ /#{rule_name.source}|^$/, m.cust_no =~ /#{rule_name.cust_no}|^$/, m.call_type =~ /#{rule_name.call_type}|^$/, m.priority =~ /#{rule_name.priority}|^$/, m.group_owner =~ /#{rule_name.group_owner}|^$/, m.entitlement_code =~ /#{rule_name.entitlement_code}|^$/ ] do |v|

As if I leave it in and comment everything else out, I can get the error to come up. I was thinking about limiting the amount of stuff I'm loading in as events using select, thinking perhaps there is something there... but that's more a guess than anything. I "destroyed" some Records which are associated with Users prior to this happening (they only keep record of who was notified what), Users is related to Groups, Groups have Rules, Rules can be associated with Events. I feel like its a leap for that to be the cause even if it started happening around when I did that. I am tempted to empty out the DB entirely (except for the rules, groups & users) and let it repopulate... but of course when I want to clean up old records I might get this again if it IS the cause.

@fail_count shouldn't come into play as the engine is called when I do @rule_engine.match just before that, so fail count is just me tracking how many times it runs longer than I think it should (as its scheduled to run every x minutes and I don't want it tripping up on itself, had that didnt end well). Plus, works fine without that line in it.

Is there an upper limit for Ruleby? Could I just be getting to the point of it being confused?

Andrew Bills

unread,
Oct 16, 2012, 2:09:06 AM10/16/12
to rul...@googlegroups.com
Id still like better debugging because otherwise I am just guessing.... but if I was guessing and I go with its relating to when Ruleby gets Events.all asserted into it perhaps there is elements that are not directly associated. So what am I doing wrong in the below perhaps?

This is the code:

So just for ease of reading model 1 - records:

class Record < ActiveRecord::Base
  attr_accessible :email, :message, :source, :rule_id_ref, :boxcar_notify, :nma_notify,     :nmwp_notify, :mobile_ph_notify, :im_notify, :email_notify, :salesforce_notify, :prowl_notify
  belongs_to :user
end

model 2 - users:

class User < ActiveRecord::Base
  require 'notification'
  #before_create :create_rss_token - removed may use authentication token instead
  before_create :default_config

    rolify
  # Include default devise modules. Others available are:
  # :confirmable,
  # :lockable, :timeoutable and :omniauthable
 devise :invitable, :database_authenticatable, :registerable, :confirmable,
     :recoverable, :rememberable, :trackable, :validatable, :token_authenticatable

  # Setup accessible (or protected) attributes for your model
  attr_accessible :name, :email, :password, :password_confirmation, :remember_me,     :confirmed_at, :group_ids, :country_iso, :use_email_flag, :use_boxcar_flag, :use_mobile_ph_flag, :use_im_flag, :boxcar_id, :mobile_phone_no, :business_days, :business_hrs_start, :business_hrs_end, :timezone, :use_nma_flag, :nma_api_key, :use_nmwp_flag, :nmwp_api_key, :use_prowl_flag, :prowl_api_key, :authentication_token, :event_records_attributes, :role_ids
  has_and_belongs_to_many :groups
  has_many :records

I also have events which relates to rules which relates to groups which relates to users...

What I have done is deleted the old "records" with the following:

@terminated_event = Event.find_all_by_terminate_flag('1')
  @terminated_event.each do |ended_ticket|
    Event.where(:ticket_id => ended_ticket.ticket_id, :source => ended_ticket.source).destroy_all
  end

  #get rid of all 'records' older than the retention date
  @old_record = Record.where("updated_at < ?", CONFIG[:core_settings][:event_history_length].to_i.days.ago)
  @old_record.each do |old_ticket|
    Record.destroy(old_ticket)
  end

So in guessing either, Im doing something wrong in the above that when I "destroy" a record there is something left over that the rules engine is trying to reference.... or I need better debugging :)





Reply all
Reply to author
Forward
0 new messages