Wrong Counter Increment


Xiaochen Zhang

Mar 19, 2017, 7:40:38 AM
to open-nfp
Hi

I defined a table like this, attaching a packet counter to it. "sampling.rand" is a metadata field into which I put a random number from 0 to 9. I've successfully added ten matching rules to the table.
It should work like this: for each packet, a single-digit random number is put into the sampling.rand field, and the distribution table matches that value and increments one of the ten counters by one.
To make the value visible, I send out a digest containing sampling.rand.


counter distribution_counter {
    type: packets;
    direct: distribution;
}

table distribution {
    reads {
        sampling.rand: exact;
    }
    actions {
        _nop;
    }
}
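
The digest itself is not shown above; roughly it looks like the following sketch (the field_list name, action name, and receiver id are just placeholders):

field_list sample_digest {
    sampling.rand;
}

action send_sample() {
    // generate_digest(receiver, field_list) is the P4_14 primitive
    generate_digest(0, sample_digest);
}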


The problem is that the counter increment is not correct. For example, the sampling.rand value is 5, but counter[x] increases by one instead of counter[6].
I write the index as x because it keeps changing: sometimes the right counter increases, sometimes a wrong one does.
I have no idea why this happens; is there any problem with my program?

The following are the table entries, matching values from 0 to 9:

"distribution" : {
            "rules" : [
                {
                    "name" : "distribution_1",
                    "match" : {
                        "sampling.rand" : { "value" : "0" }
                    },
                    "action" : {
                        "type" : "_nop",
                        "data" : {}
                    }
                },
                {
                    "name" : "distribution_2",
                    "match" : {
                        "sampling.rand" : { "value" : "1" }
                    },
                    "action" : {
                        "type" : "_nop",
                        "data" : {}
                    }
                },



Thanks
Xiaochen

David George

Mar 20, 2017, 2:15:32 AM
to Xiaochen Zhang, open-nfp
Hi Xiaochen

You have hit a tricky problem with lookup caching.

Lookup caching works by identifying a flow key from the control flow and the table key field values, and storing the actions for the first occurrence of that flow in a cache based on the key. Subsequent occurrences of the same flow execute the same actions from the cache.

In your case, you rightly expect the random table action to be executed, but the action for the first occurrence of the flow is cached and continues to execute.

You can disable the lookup/flow cache; you will find the build option in "advanced" I believe. This will reduce performance dramatically. We have improvements to uncached lookups on the roadmap; for example, exact lookups could be a lot faster. Though, ternary will always be slow and performance won't improve there.

Often one can work around the issue; in your case you could give the distribution table only a default rule and use the random value metadata field as the index into the counter. You need to construct things so that the same sequence of actions (and action parameters) can achieve the desired outcome.
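
As a rough sketch of that workaround (a static, indexed counter plus a parameterless default action; names are illustrative and the default rule is installed at runtime):

counter distribution_counter {
    type: packets;
    static: distribution;   // indexed (static) counter rather than direct
    instance_count: 10;     // one cell per possible sampling.rand value
}

action count_sample() {
    // count(counter_ref, index) is the P4_14 primitive for indexed counters;
    // the index comes from the metadata, so every packet executes the same
    // cached action while still bumping a different cell
    count(distribution_counter, sampling.rand);
}

table distribution {
    // no reads block: every packet falls through to the default rule
    actions {
        count_sample;
    }
}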

There is some documentation about this problem in the context of registers and meters in the release notes and user guide.

David George






Xiaochen Zhang

Mar 24, 2017, 10:00:42 AM
to open-nfp, xz1...@nyu.edu
Hi David

Excuse me, but I still don't totally understand it.

Lookup caching works by identifying a flow key from the control flow and the table key field values, and storing the actions for the first occurrence of that flow in a cache based on the key. Subsequent occurrences of the same flow execute the same actions from the cache.

In your case, you rightly expect the random table action to be executed, but the action for the first occurrence of the flow is cached and continues to execute.

If the action for the first flow is cached, does that mean the following packets will not be matched any more? The direct counter is associated with each of the table entries. You said the lookup identifies a flow key from the control flow and the table key field values; what is the control flow matching? I still don't understand why the wrong counter increases.


 
You can disable the lookup/flow cache; you will find the build option in "advanced" I believe. This will reduce performance dramatically. We have improvements to uncached lookups on the roadmap; for example, exact lookups could be a lot faster. Though, ternary will always be slow and performance won't improve there. 

I'm using the CLI instead of the GUI, so could you give me a hint about which configuration file it is?

 

Often one can work around the issue; in your case you could give the distribution table only a default rule and use the random value metadata field as the index into the counter. You need to construct things so that the same sequence of actions (and action parameters) can achieve the desired outcome.

There is some documentation about this problem in the context of registers and meters in the release notes and user guide.

If I give the distribution table only a default rule, there's no matching key, so all packets will hit that rule. Is there only one direct counter associated with the default rule? Or do you mean using a static counter? How would I count the different random values?


Looking forward to hearing from you.
Thanks very much!
Xiaochen

Xiaochen Zhang

Mar 29, 2017, 1:46:36 PM
to open-nfp, xz1...@nyu.edu
Hi David

Thanks for your reply. Now I understand what you meant, and after disabling the flowcache it works properly.

However, there's one more question about the counter problem.
Say I have two tables like this

table Table1 {
    reads {
        ipv4.srcaddr;
    }
    actions {
        add_score;
    }
}

table Table2 {
    reads {
        ipv4.dstaddr;
    }
    actions {
        add_score;
    }
}

action add_score(score_value) {
        add_to_field(score_metadata.score, score_value);
}


I add rules to these two tables using a json file and it succeeds.
But when I add a counter to Table1 like this:

counter test_counter {
        type: packets;
        direct: Table1;
}

During compilation, there's a warning:

Note: duplicating action add_score due to 'direct' stateful entities



This time, I try to install the same rules using the same json file, and it fails.

error: QUALIFIED: Reload of user config failed: failed to process user configuration

Why can't I attach a direct counter to a table that has an action shared by multiple tables? I'm confused about the counter attachment.
Is the relationship "matching field -- counter" or "action -- counter"? If it's the former, the two tables use different matching fields, so the counter should not be affected. If it's the latter, the two tables use the same action, so maybe it makes sense.

Above is just my assumption. Looking forward to hearing from you.

Thanks
Xiaochen



David George

Mar 30, 2017, 3:31:54 AM
to Xiaochen Zhang, open-nfp
Hi Xiaochen

You should probably have started a new thread, as this is a new problem. Forums are easiest to navigate by title, and lumping multiple topics into the same thread makes browsing a pain.


Say I have two tables like this

table Table1 {
    reads {
        ipv4.srcaddr;
    }
    actions {
        add_score;
    }
}

table Table2 {
    reads {
        ipv4.dstaddr;
    }
    actions {
        add_score;
    }
}

action add_score(score_value) {
        add_to_field(score_metadata.score, score_value);
}


I add rules to these two tables using a json file and it succeeds.
But when I add a counter to Table1 like this:

counter test_counter {
        type: packets;
        direct: Table1;
}

During compilation, there's a warning:

Note: duplicating action add_score due to 'direct' stateful entities



What is happening here is that the action is being duplicated, because the direct counter's count operation is absorbed into the action code. Thus the need for two flavours of the same action, one per table. In my case the two action names were add_score and __Table1__add_score. The problem is that the runtime interface now expects the action name __Table1__add_score in the json file. This is rather unexpected and inconvenient and, honestly, a bug. For what it's worth, the GUI config editor will put the correct action name in the drop-down.

The easiest thing to do would be to manually clone the action, i.e. declare add_score_table1() and add_score_table2(), to avoid this.
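
A minimal sketch of that clone (same body, distinct names, so each table keeps its own copy and the compiler does not need to rename anything):

action add_score_table1(score_value) {
    add_to_field(score_metadata.score, score_value);
}

action add_score_table2(score_value) {
    add_to_field(score_metadata.score, score_value);
}

Table1 then lists add_score_table1 in its actions block, Table2 lists add_score_table2, and the json rules reference those names directly.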

Regards,
David

Xiaochen Zhang

Mar 30, 2017, 7:23:51 PM
to open-nfp, xz1...@nyu.edu
Thanks David, it works as you said.

Sorry for the mistake, I should have posted it in another thread.

Abhishek Dixit

May 22, 2024, 6:24:28 AM
to open-nfp
Hey, can you please point me to the config file that needs to be modified to disable the lookup cache?

JJ W

May 24, 2024, 8:58:07 AM
to open-nfp
Use "-d flowcache". For example:

--nfp4c_p4_version 16 --no-debug-info -p out -o firmware.nffw -l beryllium -4 l3_basic.p4 -d flowcache

Abhishek Dixit

May 25, 2024, 1:50:15 AM
to open-nfp
Thank you for responding, but this gives an error:
"/opt/netronome/p4/components/nfp_pif/me/apps/common/src/fc.c(8) : catastrophic error: could not open source file "flow_cache_global_c.h"
  #include <flow_cache_global_c.h>
"
I believe you have encountered this error previously as well. When I try to put -include followed by the path after -d flowcache, the error still remains.
