Using the sFlow-RT REST API and some general doubts

Richard Mayers

Dec 1, 2015, 1:13:10 PM
to sFlow-RT
Hi Peter and everybody,

I am working on a load-balancing project and would like to use sFlow-RT to estimate, in real time, the size of all the flows crossing the vSwitch (OVS). On the web page you present many DoS detection examples in which a flow is considered big if it exceeds 10% of the link bandwidth, and you set the sampling rate based on the link capacity and that 10% threshold. Could you give me some hints, or tell me which sampling rate or strategy would be best for estimating the size of all flows? Of course it is more important to be accurate with big flows, but I want to find a trade-off so that I can also detect small and medium flows without losing much accuracy on the big ones.

Could you also point me to the documentation on how to define flows, so that I can use the REST API to get sFlow data for all the UDP and TCP flows going to output X?

Thank you in advance!!

kind regards,
Richard Mayers

Peter Phaal

Dec 1, 2015, 1:29:25 PM
to sFlow-RT
The following reference material describes the parameters used to define flows, http://sflow-rt.com/define_flow.php

Many of the examples use a threshold to alert on large flows (to load balance, mark, or block). You should set the byFlow:true flag when defining the threshold to ensure that all flows exceeding the threshold generate events.

You can also use the /activeflows/json REST API call, or activeFlows() embedded JavaScript function in an interval handler to poll the topN flows seen by OVS.

The following article suggests sampling rates that should be used, http://blog.sflow.com/2013/06/large-flow-detection.html. In the case of OVS, the sampling rate applies to the whole bridge, so you should pick a rate as a function of the host network adapter speed.

You can trade off the speed of detection against accuracy for smaller flows by adjusting the value of t when you define the flow. t is the window in seconds over which the traffic rate is calculated. You can also set log:true in the flow definition to get a log of all the expired flows, http://blog.sflow.com/2013/08/restflow.html
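
The flow and threshold settings described above could be sent to the REST API roughly as follows. This is a minimal sketch, not a definitive client: the address `http://localhost:8008`, the names `tcpflows` and `bigflow`, the value of `t`, and the 1,250,000 bytes/second threshold are all assumptions chosen for illustration. The requests are only built here, not sent.

```python
import json
from urllib.request import Request

RT = "http://localhost:8008"  # assumed sFlow-RT address


def put_json(path, body):
    """Build (but do not send) a PUT request carrying a JSON body."""
    return Request(
        RT + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Flow definition: byte rate per TCP connection, calculated over a
# window of t seconds, with expired flows written to the flow log.
flow = {
    "keys": "ipsource,ipdestination,tcpsourceport,tcpdestinationport",
    "value": "bytes",
    "t": 2,        # hypothetical window; larger t smooths small flows
    "log": True,
}
flow_req = put_json("/flow/tcpflows/json", flow)

# Threshold on that flow metric. byFlow asks for an event for every
# flow that exceeds the value, not just the first one.
threshold = {"metric": "tcpflows", "value": 1250000, "byFlow": True}
threshold_req = put_json("/threshold/bigflow/json", threshold)
```

To apply the settings, each request object can be passed to `urllib.request.urlopen` while sFlow-RT is running.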

Richard Mayers

Dec 1, 2015, 5:55:08 PM
to sFlow-RT

Thanks for your answer, Peter.

Now I was doing some tests, and something strange happened. I set the sampling rate to 1:1 (I know that is not a good idea on high-speed links) because I wanted to see whether sFlow would detect all the flows. I don't know what happened, but it did not detect a ping, although it did detect bigger flows. Shouldn't it detect everything at 1:1? I also tried with 400 small flows crossing the switch, and when I read the activeflows entry it returned at most 40~50 flows (I set maxFlows=400). Why? I need to see almost all of them to know what is happening and to load balance properly. Most surprisingly, after 5 minutes the OVS switch stopped working and I had to kill the process.

An unrelated question: if I set t = 1 when I define the flows, will the flow rate drop to 0 immediately when I stop transmitting? Right now, when I stop, the rate decreases, but quite slowly. I guess that is because of the window you mentioned, right?

Kind regards,
Richard

Peter Phaal

Dec 1, 2015, 7:27:49 PM
to Richard Mayers, sFlow-RT
sFlow-RT maintains a flow cache per switch port. The size of this
cache is a function of the n: value in the flow definition (a maximum
value of 20 is allowed). The activeFlows query accumulates data from
all the individual flow caches and so can return more than 20 results,
but a flow will not be represented unless it is in the top n in at
least one interface.
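
The behaviour described above can be modelled in a few lines. This is only a toy model of the description in this message, not sFlow-RT's implementation; the function names and the synthetic flows are hypothetical.

```python
import heapq
from collections import defaultdict


def top_n_per_port(flows, n):
    """flows: iterable of (port, key, rate). Keep the n largest per port."""
    by_port = defaultdict(list)
    for port, key, rate in flows:
        by_port[port].append((rate, key))
    return {port: heapq.nlargest(n, entries) for port, entries in by_port.items()}


def active_flows(caches, max_flows):
    """Merge all per-port caches and return the largest max_flows entries."""
    merged = [
        (rate, key, port)
        for port, entries in caches.items()
        for rate, key in entries
    ]
    return heapq.nlargest(max_flows, merged)


# Two ports carrying 30 flows each, with slightly different rates.
flows = [(port, f"p{port}-f{i}", 1000 + i) for port in (1, 2) for i in range(30)]
caches = top_n_per_port(flows, n=20)
result = active_flows(caches, max_flows=100)

print(len(result))  # 40: more than n, but fewer than the 60 flows sent
seen = {key for _, key, _ in result}
print("p1-f0" in seen)  # False: never in the top 20 on its port
```

The merged result can exceed n because it draws on several caches, yet a flow that never reaches the top n on any port stays invisible regardless of the requested result size.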

Why do you need to see all the flows to load balance properly? If you
define a flow with no keys, but with bytes as the value, you will get
a real-time view of the total traffic flowing in each port. If the
loads are unbalanced then you can look for large flows to move.
Unbalanced loads tend to be the result of large "Elephant" flows:
http://blog.sflow.com/2013/02/sdn-and-large-flows.html

If the traffic consists of many small flows then the hash based load
balancing mechanism used by OVS (and by physical switches) to balance
traffic across ECMP / LAG groups will generally be effective.

The following article includes a demonstration of large flow load balancing:
http://blog.sflow.com/2015/06/leaf-and-spine-traffic-engineering.html

The sFlow-RT Fabric View application gives an idea of the type of
analytics that you might want to use for load balancing:
http://blog.sflow.com/2015/10/fabric-view.html

Are you using Open vSwitch with Mininet? If so, you might want to look
at the following example as well:
http://blog.sflow.com/2015/01/hybrid-openflow-ecmp-testbed.html

Richard Mayers

Dec 1, 2015, 7:58:50 PM
to sFlow-RT, richard....@gmail.com
Thanks for the answer again,


> sFlow-RT maintains a flow cache per switch port. The size of this
> cache is a function of the n: value in the flow definition (a maximum
> value of 20 is allowed). The activeFlows query accumulates data from
> all the individual flow caches and so can return more than 20 results,
> but a flow will not be represented unless it is in the top n in at
> least one interface.

Well, actually I only want to know about the flows that go to one specific output. So the maximum I can monitor is 20? Is there a way to see more? I explain why in the next paragraph below.

> Why do you need to see all the flows to load balance properly? If you
> define a flow with no keys, but with bytes as the value, you will get
> a real-time view of the total traffic flowing in each port. If the
> loads are unbalanced then you can look for large flows to move.
> Unbalanced loads tend to be the result of large "Elephant" flows:
> http://blog.sflow.com/2013/02/sdn-and-large-flows.html
>
> If the traffic consists of many small flows then the hash based load
> balancing mechanism used by OVS (and by physical switches) to balance
> traffic across ECMP / LAG groups will generally be effective.

Well, the problem is that I did not explain how I do the load balancing. I don't want to load balance in the switch. The idea is to combine an SDN switch with regular Cisco routers. I am not using Mininet; I have my own setup with switches and routers. Let's say I have the following topology (it can be extended to something more complex between the two switches):

H1 ---- switch 1 ---- router 1 ---- router 2 ---- switch 2 ---- H2

Between router 1 and router 2 there are 4 equal-cost interfaces, so they use ECMP. The idea behind my project is to load balance between the two routers. The problem with ECMP is that, depending on the flows, and especially when there are few of them, it can give poor results. What I am doing is the following: I monitor what happens between the two routers with NetFlow, so I know which interface each flow goes through. With the OVS switch, I modify the header of some packets at switch 1 so that they go through the link I want, and then at switch 2 I restore the original header, so H2 does not notice any difference. With that approach I can reach 25%, 25%, 25%, 25% or 100%, 0, 0, 0, or whatever split I want, which is the good part. However, the problem is that NetFlow only gives me flow data every minute, which is bad because I can only load balance big flows, and only 1 minute or more late.

So, reading your blog, I found sFlow and wanted to combine everything: with NetFlow I learn which link every flow goes to, so I learn how to move them; to read the utilization of each link between the routers I use SNMP and read the counters every second; and finally, my idea was to use sFlow to learn flow sizes and headers much faster than with NetFlow. Is that clear, or did I confuse you?

> The following article includes a demonstration of large flow load balancing:
> http://blog.sflow.com/2015/06/leaf-and-spine-traffic-engineering.html
>
> The sFlow-RT Fabric View application gives an idea of the type of
> analytics that you might want to use for load balancing:
> http://blog.sflow.com/2015/10/fabric-view.html
>
> Are you using Open vSwitch with Mininet? If so, you might want to look
> at the following example as well:
> http://blog.sflow.com/2015/01/hybrid-openflow-ecmp-testbed.html

As I said, no. I am using OVS from the command line plus dynamips/dynagen.

Peter Phaal

Dec 2, 2015, 11:23:28 AM
to sFlow-RT, richard....@gmail.com
Thanks for the additional details - I have added comments inline.

On Tuesday, December 1, 2015 at 4:58:50 PM UTC-8, Richard Mayers wrote:
> Well, actually I want to know only the flows that go to some specific output. So the maximum I can monitor is 20? Is there a way I can see more? I will explain below in the next paragraph why.

I am still not sure why you need more than 20 flows per vNIC. 

If you are trying to reconcile the data with NetFlow, then you could enable logging and you will see all the flows that sFlow-RT sees (not just the top 20).
 


The setup you describe is very similar to the segment routing load balancer I mentioned:


In the segment routing case, the ingress top of rack switch is used to add MPLS labels to selected flows. In your case, you could use Open vSwitch to add the MPLS labels, the net result would be the same.

Flows don't need to correspond to TCP connections, you could define them to group traffic between pairs of hypervisors, subnets, etc. As a practical matter trying to act on small flows doesn't scale because there are too many, you need to group traffic into a smaller number of larger flows.

Since you are using simulated Cisco devices (dynamips), would it be possible to add sFlow export? Cisco Nexus 9k/3k switches support sFlow. The poor visibility offered by SNMP and NetFlow is going to make load balancing impractical with realistic workloads. The primary issue is the latency that these technologies add to the measurements.


You need a comprehensive, low latency set of measurements to create stable feedback control:


The following talk from last year's Open vSwitch conference discusses the visibility requirements for feedback control:

 

Richard Mayers

Dec 2, 2015, 12:55:08 PM
to sFlow-RT, richard....@gmail.com
Hi again Peter, and thank you for your nice answers! My comments are inline again.


On Wednesday, December 2, 2015 at 17:23:28 (UTC+1), Peter Phaal wrote:
> I am still not sure why you need more than 20 flows per vNIC.
>
> If you are trying to reconcile the data with NetFlow, then you could enable logging and you will see all the flows that sFlow-RT sees (not just the top 20).

I want all of them because I want to know the headers of all the current flows that go to a specific output. Can I do that with the logging option? If I log them, how can I distinguish an old flow from a currently ongoing flow that is below the top 20?

I totally agree with you that only the top 20 flows are necessary for load balancing, since the small flows will be balanced by ECMP. However, as I said, for this project I need to be able to move load using all the flows, and not only balance 25% of the load per link in the case of 4 links between the routers. For that I need to be able to move all the flows, and I need to know their headers for the matching rules at the SDN switch.


> The setup you describe is very similar to the segment routing load balancer I mentioned:
>
> In the segment routing case, the ingress top of rack switch is used to add MPLS labels to selected flows. In your case, you could use Open vSwitch to add the MPLS labels, the net result would be the same.
>
> Flows don't need to correspond to TCP connections, you could define them to group traffic between pairs of hypervisors, subnets, etc. As a practical matter trying to act on small flows doesn't scale because there are too many, you need to group traffic into a smaller number of larger flows.
>
> Since you are using simulated Cisco devices (dynamips), would it be possible to add sFlow export? Cisco Nexus 9k/3k switches support sFlow. The poor visibility offered by SNMP and NetFlow is going to make load balancing impractical with realistic workloads. The primary issue is the latency that these technologies add to the measurements.

I need to assume that the Cisco router does not support sFlow (only a few Cisco routers support it). So the idea I mentioned is to use NetFlow only to learn where a particular flow (ipsrc, ipdst, and ports) is going to be sent. With 4 links between the routers, I just want to learn which link (0, 1, 2, or 3) it is hashed to, so that I can use that information to move other flows by giving them the same IP addresses and ports. From the routers I also want to read counters to see how well I am load balancing. Do I have another option if I cannot use sFlow on the routers? And finally, I want to use sFlow to learn which flows are currently active and their sizes, so that I can install rules in the switch that modify the tuple (ipsrc, ipdst, srcport, dstport). Therefore, I do not care if NetFlow is slow; I only care about the accuracy of sFlow and the SNMP counters.

Peter Phaal

Dec 2, 2015, 1:50:25 PM
to Richard Mayers, sFlow-RT
I don't believe you will be able to reliably infer the hash function
in the Cisco routers by observing the NetFlow records. In addition,
there is no guarantee that the hash function used in the dynamips
emulation is representative of the hardware hash function used in
Cisco ASICs.

In the dynamips case, you could probably save yourself a lot of effort
by finding the hash function in the source code - you could then use
this to validate how accurate your NetFlow based inference is, or
eliminate the need for NetFlow by using the function directly.

If the hash function that the routers use for link selection was
known, then the right approach would be to apply the function in the
sFlow-RT flow definition (using a Key Function
http://sflow-rt.com/define_flow.php#keyfunctions ). Something along
the lines:
modulo:[cisco_hash:ipsource:ipdestination:tcpsourceport:tcpdestinationport]:4

This way you would only need four flows, the traffic to each ECMP
member. However, hash functions are generally proprietary and vendors
do not publicly disclose them and so this approach is not very
practical and so it is unlikely that we would add support for it in
sFlow-RT.
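
To illustrate the idea of collapsing all traffic into one flow per ECMP member, here is a rough sketch. The real router hash is unknown, so CRC32 is used purely as a stand-in; `ecmp_bucket`, `load_per_member`, and the example flows are hypothetical names and data.

```python
import zlib


def ecmp_bucket(src, dst, sport, dport, members=4):
    """Stand-in link selection: CRC32 of the 4-tuple, modulo member count.

    This is NOT the Cisco hash, only a placeholder with the same shape.
    """
    key = f"{src},{dst},{sport},{dport}".encode("ascii")
    return zlib.crc32(key) % members


def load_per_member(flows, members=4):
    """Sum the rate of every flow into the ECMP member it hashes to."""
    load = {member: 0 for member in range(members)}
    for (src, dst, sport, dport), rate in flows.items():
        load[ecmp_bucket(src, dst, sport, dport, members)] += rate
    return load


# Eight hypothetical flows of 62,500 bytes/second (500 Kbit/s) each.
flows = {("10.0.0.1", "10.0.1.1", 5000 + i, 80): 62500 for i in range(8)}
load = load_per_member(flows)
print(load)
```

With only a handful of flows the per-member totals are usually uneven, which is the weakness of hash-based ECMP with few flows raised earlier in the thread.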

The dynamips emulation might be misleading you about the accuracy of
the SNMP counters. In most physical switches and routers the counters
are copied from hardware to software every 5 to 10 seconds and so
polling at 1 second will not give you timely utilization data. sFlow
packet samples provide much faster updates.

I am not sure that it is possible to construct a stable load balancer
given the lack of direct, real-time visibility in the routers.
Measurement is fundamental to building feedback loops and the best
solution is to select devices that support sFlow.

Richard Mayers

Dec 2, 2015, 6:56:39 PM
to sFlow-RT, richard....@gmail.com
I don't want to know how the hash function works; actually, I don't care. I just want to learn some of the outputs of the current hash at the beginning of my load balancing. Then, knowing that some flow is hashed to a given link, I will use its IP and port values to rewrite another flow that I want to move to that link.

Anyway, I will try to use sFlow in the switch, at least to learn roughly how big the incoming flows are.

Sorry for bothering you again, but I still have several questions about the data I retrieve using the REST API. Even after reading the documentation and running some tests to get some insight, I still have doubts.

1) You told me to define flows with log = True, so that they are stored when they complete (as the documentation says). So if they are finished, what is the meaning of the value field, in my case bytes? What does that number indicate? The maximum, the average? I don't see how I can use it if they are all finished flows.

2) If I understood you correctly, sFlow-RT maintains a flow cache per switch port, and the maximum size is 20 flows. Then you said that the activeFlows query accumulates the individual flow caches and so can return more than 20 results; I don't quite understand that. For example, I am generating 200 flows of exactly the same size, and when I do a GET to .../activeflows/agent/flowname/json I get a maximum of 100 different flows. How can I get more than 20? And why is 100 the maximum?

3) As I said in point 2), I am generating 200 flows of exactly the same size. NetFlow reports them as the same size, but when I do the GET on activeflows, sFlow reports them as quite different: the difference between the biggest and the smallest is huge. I guess that is because of the sampling, but the flows are quite small, about 500 Kbit/s each, and I am sampling at 1:10. What would be the best strategy to see them more evenly, so that when I do the load balancing I know they are similar in size, unlike now, when some seem really small?

4) In one of my previous messages I told you that OVS crashed. During my tests with the 200 flows I describe in this message, it crashed again. How is that possible? The flows are quite small, and before enabling the sFlow agent it never happened, even when I was sending more than 200 flows.

Sorry for my long questions; you are helping me a lot! I haven't said it before, but this work is for my master's thesis and I am a little bit lost.

Kind regards

Peter Phaal

Dec 2, 2015, 7:43:07 PM
to Richard Mayers, sFlow-RT
Answers inline.

On Wed, Dec 2, 2015 at 3:56 PM, Richard Mayers
<richard....@gmail.com> wrote:
> 1) You told me to define Flows with log = True, so they are stored when
> they are completed (as says the documentation). So if they are finished,
> what's the meaning of the field value, in my case bytes. What is that
> number indicating ? The maximum, the average ? I don't see how can I use
> that if they are all flows.

The logged flow records are just like NetFlow. Total bytes or frames,
start time, end time, and the set of keys. Here is an example from
http://blog.sflow.com/2013/08/restflow.html

{
"agent": "10.0.0.20",
"dataSource": "5",
"end": 1377658681678,
"flowID": 249,
"flowKeys": "10.0.0.20,10.0.0.236,47571,3260",
"name": "tcp",
"start": 1377658613678,
"value": 1217600
}
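
As a worked example of reading such a record, the average rate follows from the total and the two timestamps. This sketch assumes, as the 13-digit values suggest, that `start` and `end` are milliseconds since the epoch; `average_rate` is a hypothetical helper.

```python
record = {
    "agent": "10.0.0.20",
    "dataSource": "5",
    "end": 1377658681678,
    "flowID": 249,
    "flowKeys": "10.0.0.20,10.0.0.236,47571,3260",
    "name": "tcp",
    "start": 1377658613678,
    "value": 1217600,
}


def average_rate(rec):
    """Average bytes/second over the record, or None for zero duration."""
    duration_s = (rec["end"] - rec["start"]) / 1000.0  # assumed milliseconds
    if duration_s <= 0:
        return None
    return rec["value"] / duration_s


rate = average_rate(record)
print(round(rate, 2))      # 17905.88 bytes/second over 68 seconds
print(round(rate * 8))     # 143247 bits/second
```

A record whose `start` and `end` are equal has no usable duration, so the helper returns `None` instead of dividing by zero.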

>
> 2) If I understood you correctly, sFlow-RT maintains a flow cache per
> switch port, and the maximum size is 20 flows. Then you said that
> activeFlows query accumulates individual flow caches and so can return more
> than 20 results, I quite don't understand that. For example I am generating
> 200 flows of exactly the same size and when I do a get to
> .../activeflows/agent/flowname/json I get a maximum of 100 different
> "flows", how can I get more than 20 ? And why is 100 the maximum?

You control the maximum number of returned flows using the maxFlows
argument to the activeFlows query. The default value is 100.

http://sflow-rt.com/reference.php#rest
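
For example, the query URL with an explicit maxFlows could be assembled like this. It is a sketch only: the base address and the helper name `activeflows_url` are assumptions, and the agent and flow name are taken from later in this thread.

```python
from urllib.parse import quote, urlencode


def activeflows_url(base, agent, flow_name, max_flows=100, min_value=None):
    """Build an /activeflows query URL with explicit result limits."""
    query = {"maxFlows": max_flows}
    if min_value is not None:
        query["minValue"] = min_value
    path = f"/activeflows/{quote(agent)}/{quote(flow_name)}/json"
    return f"{base}{path}?{urlencode(query)}"


url = activeflows_url("http://localhost:8008", "127.0.0.1", "udpFlows", max_flows=200)
print(url)
# http://localhost:8008/activeflows/127.0.0.1/udpFlows/json?maxFlows=200
```

Raising maxFlows only lifts the cap on the size of the reply; it does not change how many flows each per-port cache tracks.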

> 3) As I said in point 2), I am generating 200 flows that are exactly of the
> same size, netflow reports them as the same size, but when I do the get
> activeflows sFlow reports them as if they were quite different the
> difference between the biggest and smallest is quite huge. I guess that's
> because of the sampling, but the flows are quite small like 500Kbit/s each
> one and I am sampling at 1:10, what would be the best strategy to see them
> more evenly, so when I do the load balancing I can know they are similar
> sizes and not like know that some seem really small.

NetFlow reports the total bytes over the duration of the flow, so when
you divide by the flow duration you get an average rate. sFlow-RT
reports the current rate (bytes/second) for each flow and you would
expect the rate to vary over the lifetime of the flow, particularly
when you have many flows competing for the same network resources.

What is the duration of each flow you are generating? Also, as you
point out, sampling will introduce some noise to the signal,
particularly if the flows are all low data rate.
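
A rough calculation shows how much sampling noise to expect for flows this small. It uses the commonly quoted packet-sampling bound, percent error <= 196 * sqrt(1/c) at 95% confidence, where c is the number of samples seen for the flow. The 1500-byte packet size and the window lengths are assumptions, and the bound is only an approximation when c is very small.

```python
import math


def expected_samples(bits_per_second, packet_bytes, sampling_rate, window_s):
    """Expected number of packet samples from one flow in the window."""
    packets_per_second = bits_per_second / 8 / packet_bytes
    return packets_per_second * window_s / sampling_rate


def percent_error(samples):
    """Approximate 95% confidence bound on the relative error."""
    return 196.0 * math.sqrt(1.0 / samples)


# One 500 Kbit/s flow, assumed 1500-byte packets, sampled at 1-in-10.
for window_s in (1, 20):
    c = expected_samples(500_000, 1500, 10, window_s)
    print(f"{window_s:>2} s window: {c:.1f} samples, error up to {percent_error(c):.0f}%")
```

Under these assumptions a 1 second window yields only about four samples per flow, so estimates for identical flows can easily differ by a large factor; a longer window or a heavier flow reduces the spread.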

I should also note that sFlow-RT is designed to identify large flows.
The activeflows query returns the largest N flows. If you have a flat
distribution with hundreds of equal size flows then activeflows will
just return a random selection of flows.

> 4) In one of my previous messages I told you that ovs crashed, so actually
> doing my tests with the 200 flows I am talking about during this message, it
> crashed again. How can it be possible ? The flows are quite small and before
> enabling the sFlow agent it never happened, and I was sending more flows
> than 200...

What version of Open vSwitch are you using? I would recommend using
the latest version since there have been significant performance and
stability enhancements in the last couple of years.

Richard Mayers

Dec 2, 2015, 8:16:55 PM
to sFlow-RT, richard....@gmail.com
Comments inline.

They were, let's say, "infinite". I was just doing some testing with 200 constant flows, to see whether I could detect them as flows of more or less the same size, but then suddenly the switch stopped forwarding packets, and I could not even reset it or execute commands (ovs-vsctl, ovs-ofctl); I had to kill the process.

> I should also note that sFlow-RT is designed to identify large flows.
> The activeflows query returns the largest N flows. If you have a flat
> distribution with hundreds of equal size flows then activeflows will
> just return a random selection of flows.

There is something I don't quite get about that, and it has confused me from the beginning. What is the relation between N and the maxFlows argument you can use with the activeflows query? If I am sending 200 flows and I set maxFlows = 200, then I will see all of them, right? You also said at some point that 20 was the maximum, so why am I able to see 100? They all go to the same port.

> 4) In one of my previous messages I told you that ovs crashed, so actually
> doing my tests with the 200 flows I am talking about during this message, it
> crashed again. How can it be possible ? The flows are quite small and before
> enabling the sFlow agent it never happened, and I was sending more flows
> than 200...

> What version of Open vSwitch are you using? I would recommend using
> the latest version since there have been significant performance and
> stability enhancements in the last couple of years.

I am using 2.3.2, which the web page lists as an LTS release. I will install 2.4 to see if there is any difference.

Thanks.

Peter Phaal

Dec 3, 2015, 1:14:15 AM
to Richard Mayers, sFlow-RT
On Wed, Dec 2, 2015 at 5:16 PM, Richard Mayers
<richard....@gmail.com> wrote:
> There is something I don't quite get about that, and I am confused since the
> beginning. What's the relation between N and the maxFlows
> argument you can use with the activeflows command. If I am sending 200
> flows, and I set maxFlows = 200, then I will see all of them, right ? You
> also said at some point that 20 was the maximum, then why am I able to see
> 100? They all go to the same port.

A separate flow cache is maintained for each switch port. Each flow
cache accurately tracks up to 20 large flows. However, if there
are no large flows then the top 20 will just be a random selection.
There may be additional flows in the cache that haven't yet been
flushed as well.

The total number of flows being tracked is 20 * number of switch
ports. The activeflows query finds the largest flows in all the
caches. The result size is limited to a maximum of maxFlows entries.

Richard Mayers

Dec 3, 2015, 11:15:10 AM
to sFlow-RT, richard....@gmail.com
If the active timeout for my flows is, for example, 10 seconds, why is a finished flow still being reported by the activeflows query minutes later, and why does the logged flows query report value = 0?

Flow definition :

{u'udpFlows': {u'activeTimeout': 10,
  u'filter': u'outputifindex=39',
  u'flowStart': True,
  u'fs': u',',
  u'keys': u'ipsource,ipdestination,udpsourceport,udpdestinationport',
  u'log': True,
  u'n': 20,
  u't': 1,
  u'value': u'bytes'}}

activeflow answer: 

[{u'agent': u'127.0.0.1',
  u'dataSource': u'36',
  u'flowN': 1,
  u'key': u'84.68.163.16,1.0.3.2,1464,33336',
  u'value': 6.417671266887294e-229}]

flows answer: why value is 0 ?

[{u'agent': u'127.0.0.1',
  u'dataSource': u'36',
  u'end': 1449158507095,
  u'flowID': 0,
  u'flowKeys': u'84.68.163.16,1.0.3.2,1464,33336',
  u'name': u'udpFlows',
  u'start': 1449158507095,
  u'value': 0}]


I want to be able to see the total size in bytes of the flow as fast as possible.  

Is there a way to erase the logged flows database?

Kind regards



Peter Phaal

Dec 3, 2015, 11:54:59 AM
to Richard Mayers, sFlow-RT
On Thu, Dec 3, 2015 at 8:15 AM, Richard Mayers
<richard....@gmail.com> wrote:
> If the active timeout for my flows is for example 10 seconds, why after
> minutes a finished flow is still being reported by the activeflow query and
> if I use the logged flows query value = 0.
>
> Flow definition :
>
> {u'udpFlows': {u'activeTimeout': 10,
> u'filter': u'outputifindex=39',
> u'flowStart': True,
> u'fs': u',',
> u'keys': u'ipsource,ipdestination,udpsourceport,udpdestinationport',
> u'log': True,
> u'n': 20,
> u't': 1,
> u'value': u'bytes'}}
>
> activeflow answer:
>
> [{u'agent': u'127.0.0.1',
> u'dataSource': u'36',
> u'flowN': 1,
> u'key': u'84.68.163.16,1.0.3.2,1464,33336',
> u'value': 6.417671266887294e-229}]

The activeTimeout has no effect on the calculation of rates returned
by the /activeflows query (or a /metric query).

sFlow-RT calculates an exponential moving average based on the value
of t in the flow definition,
https://en.wikipedia.org/wiki/Moving_average. The algorithm generates
a long tail. However, the value of 6e-229 is essentially zero. If you
want to eliminate these insignificant results, set the minValue
argument in the /activeflows query, e.g. minValue=0.1 will only return
flows where the value is greater than 0.1
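
The long tail is easy to reproduce with a textbook exponential moving average. The exact smoothing formula used by sFlow-RT is not given here, so the update rule below (alpha = 1 - exp(-dt/t)) is an assumption chosen only to show the shape of the decay; the starting rate is hypothetical.

```python
import math


def ema_update(rate, sample, dt, t):
    """One step of an exponential moving average with time constant t."""
    alpha = 1.0 - math.exp(-dt / t)
    return rate + alpha * (sample - rate)


# A flow running at 62,500 bytes/second stops; poll once per second.
rate = 62500.0
history = []
for _ in range(20):
    rate = ema_update(rate, 0.0, dt=1.0, t=1.0)
    history.append(rate)

# Client-side equivalent of asking for minValue=0.1.
visible = [value for value in history if value > 0.1]
print(len(visible))   # 13
print(history[-1])    # tiny but not zero
```

In this model the reported rate never reaches exactly zero, it just keeps shrinking, which matches values such as 6e-229 appearing long after a flow has stopped. Filtering on a minimum value hides the tail.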

>
> flows answer: why value is 0 ?
>
> [{u'agent': u'127.0.0.1',
> u'dataSource': u'36',
> u'end': 1449158507095,
> u'flowID': 0,
> u'flowKeys': u'84.68.163.16,1.0.3.2,1464,33336',
> u'name': u'udpFlows',
> u'start': 1449158507095,
> u'value': 0}]

The value is zero because there were no bytes transferred for this
flow. The record was generated because there is still an entry in the
flow cache.

> I want to be able to see the total size in bytes of the flow as fast as
> possible.

sFlow-RT rate values give you an instantaneous estimate of the current
bytes/second or packets/second associated with each flow.

Total bytes or frames measurements lag because they are aggregated over
an interval and reported at the end of the interval (the
activeTimeout).

For load balancing you probably want to be using the rates reported by
/activeflows or /metric since this will give you the fastest
measurement. Actually, the fastest possible way to respond to a large
flow is to set a threshold since you will be asynchronously notified
the instant the traffic rate crosses the threshold.
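
A rough Python sketch of the threshold approach follows. It only builds the HTTP requests and does not send them. The /threshold/{name}/json and /events/json paths, the thresholdID, maxEvents and timeout query parameters, the threshold name "elephant" and the 1 Mbit/s value are assumptions based on commonly published sFlow-RT examples, not on anything stated in this thread; byFlow is the flag mentioned earlier in the thread. Check the REST API explorer of your own installation before relying on them.

```python
import json
import urllib.request

RT = "http://localhost:8008"  # assumed sFlow-RT address

def threshold_request(name, metric, value, base=RT):
    """Build (but do not send) a PUT request defining a per-flow threshold."""
    body = {"metric": metric, "value": value, "byFlow": True}
    return urllib.request.Request(
        f"{base}/threshold/{name}/json",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )

def events_url(threshold_id, max_events=10, timeout=60, base=RT):
    """URL for a long-poll query returning events for one threshold."""
    return (f"{base}/events/json?thresholdID={threshold_id}"
            f"&maxEvents={max_events}&timeout={timeout}")

# Hypothetical threshold: flag any udpFlows flow above 1 Mbit/s (125000 bytes/s).
req = threshold_request("elephant", "udpFlows", 1_000_000 / 8)
print(req.get_method(), req.full_url)
print(events_url("elephant"))
```

Against a running instance, the request could be sent with urllib.request.urlopen(req), and the events URL polled in a loop so that the load balancer reacts as soon as an event arrives.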

The logged flow records are more useful for traffic accounting and to
provide a log of flows for security use cases.

> Is there a way to erase the logged flows database?

You can't erase the flow log, but you can filter on flowID to
eliminate older values when you make a /flows query. The
extras/tail_flows.py script uses flowID to track the latest flows.

Deleting the flow definition will remove the associated active flow
caches. You can then add the flow definition again and you will have
cleared the state.
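
Both ideas can be sketched in Python. This is an illustration under assumptions, not the contents of extras/tail_flows.py: newer_than and reset_requests are hypothetical helpers, the /flow/{name}/json path with DELETE and PUT is assumed from the usual sFlow-RT REST conventions, and the records with flowID 1 and 2 are made up. The requests are built but not sent.

```python
import json
import urllib.request

RT = "http://localhost:8008"  # assumed sFlow-RT address

def newer_than(records, last_flow_id):
    """Records with flowID above last_flow_id, oldest first."""
    fresh = [r for r in records if r["flowID"] > last_flow_id]
    return sorted(fresh, key=lambda r: r["flowID"])

def reset_requests(name, definition, base=RT):
    """Build (but do not send) a DELETE then a PUT for one flow definition."""
    url = f"{base}/flow/{name}/json"
    delete = urllib.request.Request(url, method="DELETE")
    put = urllib.request.Request(
        url,
        data=json.dumps(definition).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    return delete, put

# flowID 0 is from the thread; flowIDs 1 and 2 are made up for illustration.
records = [
    {"flowID": 2, "value": 3000},
    {"flowID": 0, "value": 0},
    {"flowID": 1, "value": 1500},
]
print([r["flowID"] for r in newer_than(records, 0)])

# Subset of the flow definition quoted earlier in the thread.
definition = {
    "keys": "ipsource,ipdestination,udpsourceport,udpdestinationport",
    "value": "bytes",
    "filter": "outputifindex=39",
    "log": True,
    "t": 1,
    "activeTimeout": 10,
}
delete, put = reset_requests("udpFlows", definition)
print(delete.get_method(), put.get_method(), put.full_url)
```

Remembering the highest flowID seen so far and passing it on the next poll means each query only has to deal with records logged since the previous one.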

Richard Mayers

Dec 9, 2015, 9:05:48 PM12/9/15
to sFlow-RT, richard....@gmail.com

Hi Peter,

Thank you for all your previous answers; they were very useful!


I have a new question. I am generating 100 flows of the same size with iperf and reading the sFlow data every second. The parameter "t" in the flow definition is set to 3; I also tried other values (1, 2, 3, 4, ...) and different sampling rates, and in the most extreme case I set the sampling rate to 1:1 on the switch.

My problem is that every second, when I read the sFlow data, the values of all the flows oscillate too much. How can this happen if I am sampling 1:1? If I am sampling everything, how is it possible that the bit rates reported by sFlow differ so much, with the largest value 2 to 3 times bigger than the smallest?

Do you have any solution to mitigate this problem?

Kind regards,
Richard  
 

Peter Phaal

Dec 9, 2015, 11:13:31 PM12/9/15
to Richard Mayers, sFlow-RT
Network flows are never smooth. The finer the timescale, the more
variable the data rate. At a short enough time scale, link utilization
is either 0% or 100%; any intermediate value is the result of averaging
over an interval.

Sampling at 1-in-1 doesn't necessarily improve accuracy since the high
rate of packet samples is likely to overload the switch and result in
clipping (lost samples) and bias.

If you look at all the steps between a sample being taken in the
dataplane, queued to the sFlow agent, marshaled in an sFlow UDP
datagram, enqueued in the sender's IP stack, transmitted across the
network, enqueued in the collector's IP stack and finally processed by
the analyzer application, there is variability in timing that affects
rate calculations.

Higher t values average over longer intervals and reduce variability,
but at the expense of responsiveness. The correct tradeoff of
averaging and sampling rates depends on your use case.
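
The effect of t can be sketched with a small Python simulation. It assumes a textbook exponentially weighted moving average with time constant t, which may differ in detail from what sFlow-RT does internally; smooth and spread are hypothetical helper names, and the on/off traffic pattern is made up to mimic a link that is either fully busy or idle in each one-second reading.

```python
import math

def smooth(samples, t, dt=1.0):
    """Exponentially weighted moving average with time constant t (seconds)."""
    alpha = 1.0 - math.exp(-dt / t)
    avg, out = 0.0, []
    for x in samples:
        avg += alpha * (x - avg)
        out.append(avg)
    return out

def spread(samples, t, warmup=50):
    """Max minus min of the smoothed series once it has settled."""
    settled = smooth(samples, t)[warmup:]
    return max(settled) - min(settled)

# A link that alternates between fully busy and idle each second.
bursty = [100.0, 0.0] * 100

# Larger t gives a smaller swing between consecutive readings.
for t in (1, 2, 4, 8):
    print(t, round(spread(bursty, t), 1))
```

The swing shrinks as t grows, but a larger t also means the smoothed rate takes longer to follow a real change in traffic, which is the responsiveness cost described above.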

Tasbiha Fatima

Dec 20, 2019, 9:49:36 AM12/20/19
to sFlow-RT
Hi Peter

I am doing a project on ONOS and want to connect sFlow-RT to my Mininet network. I have already installed sFlow-RT but don't know how to connect to its GUI. Please help.

Peter Phaal

Dec 20, 2019, 10:00:16 AM12/20/19
to sFlow-RT
You access the sFlow-RT REST API and applications via HTTP on port 8008, e.g. http://localhost:8008/

By default, sFlow-RT doesn't ship with much in the way of GUI functionality (there is a REST API explorer and a dashboard showing the health of the software). You should load applications such as mininet-dashboard, flow-trend, and browse-metrics if you want to explore the available data.


Additional applications are listed at https://sflow-rt.com/download.php#applications and instructions for writing your own applications at https://sflow-rt.com/writing_applications.php

Peter Phaal

Dec 20, 2019, 1:12:13 PM12/20/19
to Tasbiha Fatima, sFlow-RT
Please don't drop the group in your replies.

The screen shot you sent is from sFlow-RT 2.3. The latest version of sFlow-RT is 3.0, which simplified the core platform by moving UI functionality out of the base software into applications:


On Fri, Dec 20, 2019 at 8:38 AM Tasbiha Fatima <tasbi...@gmail.com> wrote:
Thanks Peter, it worked. But I have one more question. Can you please tell me how the dashboard shown in the screenshot can be accessed?

image.png


Tasbiha Fatima

Dec 20, 2019, 1:46:46 PM12/20/19
to Peter Phaal, sFlow-RT
Thanks a lot. 