[P4-design] new language features for "5G" and "edge" use cases


Gergely Pongracz

Jun 1, 2021, 8:56:46 AM
to p4-d...@lists.p4.org

Hi,

 

On the P4 WS we presented a use case that is, I admit, not very unique these days (implementing 5G network functions with P4). While implementing these NFs we came across a few limitations in P4 and started to wonder whether new features solving these issues could become part of P4 in the future.

 

There are a few slides on these: https://opennetworking.org/wp-content/uploads/2021/05/Gergely-Pongracz-Slides.pdf

 

Basically there are the following 3 cases:

  1. Short-term (~1 RTT) buffering: this would be handy for segmentation-reassembly and retransmit-loop use cases. Basically, some “buffer” and “remove” actions would be needed, preferably with timer support.
  2. More generic buffering and time-based events: for programmable traffic management, packet scheduling or keepalive messages we could use a more generic method. At the API level the problem is probably very similar to the previous one, but in these cases there are probably fewer limitations on the buffer size, which could make this a bit trickier from a hardware development perspective.
  3. Modular pipelines: this would mean that we could specify multiple pipelines and a “master” pipeline that would call the underlying ones as subroutines. Perhaps more than 2 layers could be allowed, e.g. a “master” could also be re-used as a module by a higher-layer pipeline. This could probably be solved entirely at the compiler level.

 

We also described a few less important cases (e.g. re-using tables, registers, etc.), but we could easily find workarounds to overcome those limitations, so let’s focus on the ones above.

 

I’m quite new to this community, although we’ve been using P4 for a while now. So I don’t really know what the best way is to start discussing these issues and, if you find them useful, how to start working on (some of) them.

Thanks for any hints and help.

BR,

 

Gergely

 

Nate Foster

Jun 1, 2021, 9:51:14 AM
to p4-d...@lists.p4.org
Hi Gergely,

The best way to get involved is to attend the monthly P4 LDWG meeting. For bringing proposals, we have a process described in the README.md of the p4-spec repository (https://github.com/p4lang/p4-spec). Basically we are happy to entertain a high-level proposal, resulting in a thumbs up or thumbs down. For detailed proposals, we expect to see a number of things fully worked out: specification language changes, prototype implementation, and example programs. Suffice it to say, this is a lot of work, so it helps either to be certain you want to pursue it and are able to follow through, or to convince others to help you out.

Responding to these topics:
1. Does this need a language change or could these buffers be modeled with an extern? If so, then no language change is needed, just an architecture that supports buffering.
2. Is the idea the same as Ibanez et al.'s notion of events (https://dl.acm.org/doi/10.1145/3365609.3365848)? If not, what language changes would you propose?
3. Simple forms of modularity can be accomplished by treating controls as composable units. Note that they support constructors with statically-determined parameters. Otherwise, have you looked at Lyra and MicroP4 from SIGCOMM '20? (Minor: we have shifted to inclusive technology, so I recommend using the name "primary" and not the one you used.)

-N

_______________________________________________
P4-design mailing list -- p4-d...@lists.p4.org
To unsubscribe send an email to p4-desi...@lists.p4.org

Nate Foster

Jun 1, 2021, 9:53:00 AM
to p4-d...@lists.p4.org
ps. When exploring a language design, it's extremely helpful to look at concrete examples (and not just abstract features).

If you have code for some 5G use cases worked out, even in hypothetical extensions of P4, those would be great to see.

-N

Mihai Budiu

Jun 1, 2021, 1:35:06 PM
to p4-d...@lists.p4.org

BTW: the next design meeting is taking place Monday June 7 at 1:30PM Pacific Time.

You can join the meeting here: https://VMware.zoom.us/j/5366328548?pwd=dVhmMlZJQnl2czkvdENUVnIrdjBKdz09

I realize this may not be a good time for you; in that case we can organize a special session at a different time where you can present your proposals.

 

Mihai

Gergely Pongracz

Jun 4, 2021, 5:13:42 AM
to p4-d...@lists.p4.org

Hi Nate,

 

Sorry for the long delay. I uploaded our example code here: https://github.com/P4ELTE/use_cases/tree/master/p4-16/bst

The main code is cpf_ran.p4. In theory it supports TNA, PSA and V1; we used Tofino’s compiler and t4p4s (basically p4c with a DPDK backend) for compiling and running.

 

Buffering would be executed in the RANDownlink() control block. For now we add a special header and send the packet to the service IP of the BaaS (buffer-as-a-service), which currently runs as a Kubernetes service. We could just as well clone the packet, sending the original directly to the downlink path and the copy to the buffer. Instead, the packet is sent to the buffer, and on successful buffering the BaaS returns it; this way we know that the original packet is buffered and the timeout counter has started.

 

Regarding your questions:

  1. You are right, maybe it could be solved by an extern, similarly to how we solve it with a non-P4 component. On the other hand, I don’t particularly like having too many architectures around, as that kills one of the main advantages of P4 (to my knowledge), which is portability. So I’d rather go for a language change here; for me the only reason not to do that would be if the task were impossible to support on some hardware targets. You know the language much better, but I’d say buffering a few packets could be similar to having a few more registers, so buffering itself doesn’t seem a huge issue to me. Running timers and assigning events to them, on the other hand, might be a bigger change, as there could be a large number of parallel timers. Of course there are good data structures for that, but are they hardware friendly enough? Ibanez’s presentation suggests it can be done fairly simply on FPGA.
  2. According to the presentation, I think the proposed solution, especially if all the primitives proposed on slide 6 were implemented, is a superset of what we’d need (I’d say for us enqueue, dequeue and timer expiration would be enough). So if Ibanez’s proposal became part of the language, we wouldn’t need more (at least for now).
  3. Yes, if you have a look at the code you’ll see that we already use control blocks for modularizing the code. With Tofino this is sometimes not straightforward, as the compiler tends to use more stages in this case than when you use fewer control blocks (this issue was also mentioned in the uP4 talk). As I understood, Lyra is a higher-layer solution for portability over multiple DSLs, so I guess that would be handy if portability remained an issue even in the long term. I think Lyra’s composition part could deal with composing multiple modules / programs on a single switch. I guess you referred to this feature, but I don’t think we’d need a Lyra-like engine in the long run.

 

So my only remaining question is: are the proposals from Ibanez & co. already being considered by some of the working groups, e.g. LDWG? If yes, I’ll go through the details, as that is quite likely a good solution for us too.

Thanks!

 

Gergely

 

 

 

From: Nate Foster <jnfo...@cs.cornell.edu>
Sent: Tuesday, June 1, 2021 3:51 PM
To: Gergely Pongracz <Gergely....@ericsson.com>
Cc: p4-design <p4-d...@lists.p4.org>

hem...@mnkcg.com

Jun 4, 2021, 12:34:18 PM
to p4-d...@lists.p4.org

Ibanez’s proposal was implemented using Xilinx’s proprietary P4-to-FPGA SDNet toolchain. Just as multiple timers are easy to support on an FPGA, Linux user space can support them easily as well. Both DPDK and VPP (FD.io) run in Linux user space. The p4c bmv2 backend also runs in Linux user space. The PNA (Portable NIC Architecture) has also asked for timers/events. I am not sure whether any P4 language changes are needed to adopt Ibanez’s proposal; the proposal uses enqueue/dequeue blocks and metadata to support event-driven processing. I think a new architecture model for event-driven processing could be added to the P4 spec. Thereafter, the p4c changes could be implemented for the p4c UBPF or bmv2 backend. The UBPF backend is simpler to modify.

 

Since I work with p4c, I can try to help with p4c changes. On Monday, June 7th we have a P4 LDWG meeting where we could discuss what I have mentioned above.

 

Thanks and best wishes,

 

Hemant

 

 

From: Gergely Pongracz via P4-design <p4-d...@lists.p4.org>
Sent: Friday, June 04, 2021 5:14 AM
To: Nate Foster <jnfo...@cs.cornell.edu>
Cc: p4-design <p4-d...@lists.p4.org>

Subject: [P4-design] Re: new language features for "5G" and "edge" use cases

hem...@mnkcg.com

Jun 4, 2021, 3:30:18 PM
to p4-d...@lists.p4.org

If one looks at Gergely’s slides (https://opennetworking.org/wp-content/uploads/2021/05/Gergely-Pongracz-Slides.pdf), he has also asked for conditional emit in the deparser. p4c has already added an if condition to the P4 parser this year. We should discuss the use of conditionals in the deparser, at least for certain low-speed architectures.

 

Thanks,

 

Hemant

 


thomas....@gmail.com

Jun 4, 2021, 4:47:07 PM
to p4-d...@lists.p4.org

Hi Hemant,

I am also interested in conditional emit inside the deparser. Currently, I am working on adding p4c passes that simplify the deparsing graph by looking at header validity. Those passes use conditional statements to achieve the simplification.
I would be glad to help contribute to this topic.

Thanks,

Thomas

hem...@mnkcg.com

Jun 4, 2021, 4:53:38 PM
to p4-d...@lists.p4.org

Mihai,

 

I was only echoing the ask from Gergely’s slides. Indeed, I too have written code that uses an if-condition in the egress control to set a header to invalid. So, I don’t see a pressing need for an if-condition in the deparser.

 

Hemant

 

From: Mihai Budiu <mbu...@vmware.com>
Sent: Friday, June 04, 2021 4:49 PM
To: hem...@mnkcg.com; Gergely....@ericsson.com; jnfo...@cs.cornell.edu
Subject: RE: [P4-design] Re: new language features for "5G" and "edge" use cases

 

First, please note that P4 does not limit deparser to just emit statements; a deparser is a general control, and it can certainly have if statements. Only *targets* (i.e., backends) may limit what the deparser can do. I will submit a PR against the p4c-xdp target to show that this is supported by P4.

 

Second, for architectures like Tofino, a statement like this in the deparser:

 

if (condition) packet.emit(h);

 

can be rewritten as two statements:

 

  1. In egress, towards the end:

 

if (!condition)

   h.setInvalid();

 

  2. And in the deparser, an unconditional emit:

 

packet.emit(h);

 

For a specific architecture we could even make this transformation automatically.

 

Mihai
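
The equivalence Mihai describes above can be checked with a small model. This is a Python sketch rather than P4: `Header`, `emit`, and the two deparser functions are hypothetical stand-ins that only model the rule that an invalid header is not emitted.

```python
from dataclasses import dataclass
from itertools import product


@dataclass
class Header:
    name: str
    valid: bool = True

    def set_invalid(self) -> None:
        self.valid = False


def emit(packet: list, h: Header) -> None:
    """Model of packet.emit(h): an invalid header is silently skipped."""
    if h.valid:
        packet.append(h.name)


def conditional_in_deparser(condition: bool, h: Header) -> list:
    """Form 1: the deparser itself contains the if statement."""
    packet = []
    if condition:
        emit(packet, h)
    return packet


def invalidate_in_egress(condition: bool, h: Header) -> list:
    """Form 2: egress clears validity, the deparser emits unconditionally."""
    if not condition:
        h.set_invalid()      # end of egress
    packet = []
    emit(packet, h)          # deparser
    return packet


def outputs_match() -> bool:
    """Compare both forms for every combination of condition and validity."""
    return all(
        conditional_in_deparser(c, Header("h", v))
        == invalidate_in_egress(c, Header("h", v))
        for c, v in product([False, True], repeat=2)
    )
```

In this model the two forms produce the same packet for all four combinations. They differ only in a side effect: the second form leaves `h` invalid, which matters only if something after that point still reads `h`.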

 

From: hem...@mnkcg.com <hem...@mnkcg.com>
Sent: Friday, June 4, 2021 1:18 PM
To: Mihai Budiu <mbu...@vmware.com>; Gergely....@ericsson.com; jnfo...@cs.cornell.edu
Subject: RE: [P4-design] Re: new language features for "5G" and "edge" use cases

 

I understand, an invalid header is not emitted. However, what if I want an if-condition such as: if (bla-bla), then emit a header?

 

Thanks,

 

Hemant

 

From: Mihai Budiu <mbu...@vmware.com>
Sent: Friday, June 04, 2021 4:02 PM
To: hem...@mnkcg.com; Gergely....@ericsson.com; jnfo...@cs.cornell.edu
Subject: RE: [P4-design] Re: new language features for "5G" and "edge" use cases

 

Why is a conditional emit necessary? Invalid headers are not emitted.

 

Mihai

hem...@mnkcg.com

Jun 6, 2021, 4:05:32 PM
to p4-d...@lists.p4.org

 

In case anyone doesn’t have the Ibanez paper, it is attached.

 

Hemant

 

From: Hemant Singh via P4-design <p4-d...@lists.p4.org>
Sent: Friday, June 04, 2021 12:34 PM
To: Gergely....@ericsson.com; jnfo...@cs.cornell.edu

ibanez2019.pdf

Nate Foster

Jun 6, 2021, 4:24:04 PM
to p4-d...@lists.p4.org
Let’s not get in the habit of redistributing other people’s materials on this list. 

Better to link to the ACM Digital Library, ACM SIGCOMM sponsored open-access versions, or the author’s copy on their webpages; none of these run afoul of copyright law.

-N

hem...@mnkcg.com

Jun 6, 2021, 5:09:54 PM
to p4-d...@lists.p4.org

This is the website where the paper is available:

 

https://sibanez12.github.io/docs/event-driven-pisa.pdf

 

I sent out the paper thinking the Ibanez proposal would be discussed in tomorrow’s LDWG meeting and having the paper handy helps.

 

Hemant

Mihai Budiu

Jun 6, 2021, 5:12:38 PM
to p4-d...@lists.p4.org
No meeting tomorrow 


From: Hemant Singh via P4-design <p4-d...@lists.p4.org>

hem...@mnkcg.com

Jun 6, 2021, 6:29:36 PM
to p4-d...@lists.p4.org

Thanks, Mihai.

 

Incidentally, each TCP session needs 5 timers, so supporting 10 million flows requires up to 50 million concurrent timers.  Such flows could exist on a single high end server machine. Some food for thought since we are looking into timer and events.

 

Hemant
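
For a sense of what a data structure for very many concurrent timers can look like, here is a minimal Python sketch of a hashed timing wheel. It is an illustration only and was not proposed in this thread; the class and method names are hypothetical, and the sketch says nothing about whether such a structure fits a particular hardware target.

```python
from collections import defaultdict


class TimingWheel:
    """Hashed timing wheel: O(1) start/stop, each tick scans a single slot."""

    def __init__(self, num_slots: int):
        self.num_slots = num_slots
        self.slots = defaultdict(dict)   # slot index -> {timer_id: expiry_tick}
        self.where = {}                  # timer_id -> slot index
        self.now = 0

    def start(self, timer_id, delay_ticks: int) -> None:
        if delay_ticks < 1:
            raise ValueError("delay must be at least one tick")
        self.stop(timer_id)              # restarting replaces the old timer
        expiry = self.now + delay_ticks
        slot = expiry % self.num_slots
        self.slots[slot][timer_id] = expiry
        self.where[timer_id] = slot

    def stop(self, timer_id) -> None:
        slot = self.where.pop(timer_id, None)
        if slot is not None:
            del self.slots[slot][timer_id]

    def tick(self) -> list:
        """Advance one tick and return the ids of the timers that expired."""
        self.now += 1
        bucket = self.slots[self.now % self.num_slots]
        # Timers that hash to this slot but expire in a later rotation stay put.
        expired = [t for t, expiry in bucket.items() if expiry == self.now]
        for t in expired:
            del bucket[t]
            del self.where[t]
        return expired
```

Starting and stopping a timer touches one dictionary entry each, so the cost per operation does not grow with the number of active timers; the cost of a tick depends on how many timers share the current slot.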

Gergely Pongracz

Jun 7, 2021, 5:29:47 AM
to p4-d...@lists.p4.org

Hi,

 

Guys, don’t get stuck on this. As I indicated in the paper, this is a minor issue; we solved it exactly as Mihai proposed below.

BR,

 

Gergely

hem...@mnkcg.com

Jun 7, 2021, 9:38:56 AM
to p4-d...@lists.p4.org

One example of use of if-condition in deparser is below:

 

https://github.com/barefootnetworks/Open-Tofino/blob/master/p4-examples/p4_16_programs/tna_mirror/tna_mirror.p4#L220

 

TNA supports if-statements in the deparser.

hem...@mnkcg.com

Jun 7, 2021, 9:52:44 AM
to p4-d...@lists.p4.org

Gergely,

 

Sounds good.  I filed two new issues against the use_cases code.  Please look into the issues and get back.

 

https://github.com/P4ELTE/use_cases/issues/3

https://github.com/P4ELTE/use_cases/issues/2

 

I’d be happy to contribute to code changes since I have programmed the Tofino and also bmv2 and ebpf p4c targets.

 

Thanks,

 

Hemant

Mihai Budiu

Jun 7, 2021, 1:54:54 PM
to p4-d...@lists.p4.org

Section 15.1 of the spec says:

 

  • It is illegal to invoke emit on an expression whose type is a base type, enum, or error.

 

If we had emit for base types this could work very well. I see no reason why that would not work.

If that’s what you want you should file an issue with the spec so we can discuss this during the design meeting.

You don’t need a new method, the existing emit should work just fine.

 

Mihai

 

From: Gurevich, Vladimir <vladimir...@intel.com>
Sent: Monday, June 7, 2021 6:27 AM
To: Gergely Pongracz <Gergely....@ericsson.com>; hem...@mnkcg.com; Mihai Budiu <mbu...@vmware.com>; jnfo...@cs.cornell.edu
Cc: p4-d...@lists.p4.org

Subject: Re: [P4-design] Re: new language features for "5G" and "edge" use cases

 

All,

 

To be honest, I think that what would be more useful is to have an unconditional emit 😊. Aside from the fact that it would, obviously, allow us to express the traditional emit as:

 

pkt.emit(hdr.a) ::=
    if (hdr.a.isValid()) {

        pkt.unconditional_emit(hdr.a);

    }

 

it would also allow us to be more flexible with other emits and allow us to optimize the programs on certain architectures.

 

As a practical example, it is quite helpful when someone needs to create a bridge header in PSA, TNA or a similar architecture “on the fly”, e.g.:

 

header bridge_h {

     type1_t f1; /* Usually comes from meta.f1 */

     type2_t f2; /* Usually comes from meta.f2 */

}

 

struct my_headers_t {

    ethernet_h ethernet;

    . . . .

}

 

control IngressDeparser(

    packet_out   pkt,

    my_headers_t hdr,

    my_meta_t    meta)

{

    /* Emit on the fly and there is no need to keep the header validity bit around */

    pkt.unconditional_emit<bridge_h>(meta.f1, meta.f2);

    pkt.emit(hdr);

}

 

Currently we have to make the header valid and copy the data into it (typically at the end of the control). While the compiler can optimize many of these assignments, it is always better when someone can clearly express the intent.

 

Again, it is not that big a deal (it is possible to live without this), but I thought it would make the overall design more orthogonal. The name “unconditional_emit()” is certainly not the prettiest one; I put it here for clarity only.

 

Best,

Vladimir Gurevich

Principal Engineer,  Barefoot Division (BXD)
Director, Intel® Connectivity Education Hub

 

Email:      Vladimir...@intel.com  

Cell:         +1 (408) 833-4505


thomas....@gmail.com

Jun 7, 2021, 6:56:27 PM
to p4-d...@lists.p4.org

Hi everyone,

I agree with Vladimir Gurevich: the emit extern should not be conditional. In my opinion, it is redundant to have an extern that implements a construct that is already present in the language (i.e. hdr.isValid()). As an example, in the current situation, if a compiler pass could simplify away headers that are always invalid, a specific pass must be added for the emit extern.

I think an interesting solution to allow legacy code would be to add a new extern “append” which would take a vector of bits as a parameter. This extern would append the vector of bits to the packet unconditionally.
Then we could simply express the emit extern as:

if (hdr.a.isValid()) {
    pkt.append(hdr.a);
}

Thanks

Thomas

hem...@mnkcg.com

Jun 10, 2021, 11:51:49 AM
to p4-d...@lists.p4.org

Here is strawman P4 events code to implement BaaS. shared_register is an extern used in the Ibanez paper.

 

enum Events {

    set_timer,

    del_timer,

    exp_timer

}

 

struct metadata_t {

    Events    ev;

    …

}

 

The parser sets events for set_timer and del_timer. When set_timer is used, a callback is registered which sets the exp_timer event.

 

    shared_register<bit<32>>(NUM_REGS) bufSize_reg;
    bit<64> expire_time = 2;

    if (meta.ev == Events.set_timer) {
        meta.handle = timer_start(expire_time);
        bufSize_reg.write(meta.handle, hdr);
    } else if (meta.ev == Events.del_timer) {
        timer_stop(meta.handle);
        bufSize_reg.write(meta.handle, 0);
    } else if (meta.ev == Events.exp_timer) {
        resend(hdr);
        meta.handle = timer_start(expire_time);
        bufSize_reg.write(meta.handle, hdr);
    }

 

We will need timer_start() and timer_stop() services from a Timer block, similar to the Traffic Manager in an architecture. The code can be extended to support multiple timers. Right now the code uses one timer, hence the single meta.handle.

 

Hemant

 


Mihai Budiu

Jun 11, 2021, 6:56:41 PM
to p4-d...@lists.p4.org

Indeed, this should work.

It currently does not, so I filed an issue: https://github.com/p4lang/p4c/issues/2795

Hopefully this (and support in your favorite back-end) is everything that is needed.

If not, we’ll file more issues.

 

Mihai

 

From: Gurevich, Vladimir <vladimir...@intel.com>
Sent: Thursday, June 10, 2021 11:55 AM
To: Mihai Budiu <mbu...@vmware.com>; Gergely Pongracz <Gergely....@ericsson.com>; hem...@mnkcg.com; jnfo...@cs.cornell.edu
Cc: p4-d...@lists.p4.org

Subject: Re: [P4-design] Re: new language features for "5G" and "edge" use cases

 

Hello Mihai,

 

So far, I am a little hesitant to allow pkt.emit() for the scalar types, mostly because I do not have a good proposal on how to handle non-byte-aligned ones.

 

I wonder, whether a construct like:

 

       pkt.emit((bridge_t){f1, f2});

 

will do the job of an unconditional emit, since we are effectively creating a valid header on the spot, so the compiler will be free to optimize the validity check out. At the same time, the byte alignment requirement can still be enforced.

 

Thanks,


hem...@mnkcg.com

Jun 11, 2021, 7:10:46 PM
to p4-d...@lists.p4.org

I don’t see a way out unless start and stop timer are added as new externs. 

Gergely doesn’t like externs because the code is not portable.  But the new externs can be added for all software p4c backends. If externs are not used, what other choice do we have?  Maybe we can discuss this question in the June 14th LDWG meeting.

 

Thanks,

 

Hemant

 

From: Hemant Singh via P4-design <p4-d...@lists.p4.org>

hem...@mnkcg.com

Jun 16, 2021, 7:36:53 PM
to p4-d...@lists.p4.org

If a packet is not stored in a register, we could define a new BaaS control that takes a packet_in as one argument. I think P4 would need to define a new data structure to store whole packets.

 

Hemant

 

From: Mihai Budiu <mbu...@vmware.com>
Sent: Wednesday, June 16, 2021 3:13 PM
To: Gergely Pongracz <Gergely....@ericsson.com>; hem...@mnkcg.com; jnfo...@cs.cornell.edu
Subject: RE: [P4-design] Re: new language features for "5G" and "edge" use cases

 

Shouldn’t we have this conversation on the design mailing list? More people may want to weigh in.

 

We have been making moves towards factoring psa.p4 and pna.p4 into a set of common libraries, e.g., common externs (like registers). Other libraries will be useful, like standard protocol header definitions. So I think that in the long term there will be a set of useful libraries, e.g., timers, buffers, etc., and a target architecture may include several of them, signaling in this way what is available.

 

You cannot store the payload of a packet in a register, only the headers.

 

Mihai

 

From: Gergely Pongracz <Gergely....@ericsson.com>
Sent: Wednesday, June 16, 2021 2:31 AM
To: Mihai Budiu <mbu...@vmware.com>; hem...@mnkcg.com; jnfo...@cs.cornell.edu
Subject: RE: [P4-design] Re: new language features for "5G" and "edge" use cases

 

Hi,

 

I think it would make sense to consider whether such functionality could be implemented on non-CPU systems or not. If not (always), then I fully agree with having them as externs. But in my view it would of course be better to have them as part of PSA or PNA or some future, widely adopted architecture. Of course these calls will use “extern” functions in the underlying system, but having them as part of a few generic architectures would be good because that would mean better portability of the P4 code.

 

Having buffers in a Tofino-like device doesn’t seem to be impossible, as Hemant pointed out below one could use register arrays for that purpose. Of course the size is limited, but the functionality seems to be there.

 

I guess the issue is more how we implement callbacks on a Tofino-like device.

 

One workaround could be to have a register array with the packets and timeout events, just as Hemant described below, and on each (or each Nth) packet arrival we’d compare the arrival time with the timer in the first entry of the register array. If the timer has expired, we’d clone the packet and run the timer-based event on the copy (basically dropping the original packet and getting the buffered one to work on). This could fly for one specific timeout (e.g. a fixed 100 msec resend timer), in which case the register array holds the events as an ordered list (or rather a ring buffer, but the timestamps are still constantly ascending). Of course this is nasty and should be hidden under some nice API; I just wanted to check whether it could work or not.

 

Gergely
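
The workaround described above can be modeled in a few lines. This is a Python sketch under stated assumptions, not P4 and not the code from the use_cases repository: `FixedTimeoutBuffer` and its methods are hypothetical names, times are plain integers in milliseconds, and cloning and removal of acknowledged packets are not modeled.

```python
from collections import deque


class FixedTimeoutBuffer:
    """Ring-buffer model: one fixed timeout keeps entries ordered by deadline."""

    def __init__(self, capacity: int, timeout_ms: int):
        self.capacity = capacity
        self.timeout_ms = timeout_ms
        self.ring = deque()              # (deadline_ms, packet), oldest first

    def store(self, packet, now_ms: int) -> bool:
        """Buffer a packet; returns False when the ring is full."""
        if len(self.ring) >= self.capacity:
            return False
        self.ring.append((now_ms + self.timeout_ms, packet))
        return True

    def on_packet_arrival(self, now_ms: int):
        """Check only the oldest entry; return it if its timer has expired."""
        if self.ring and self.ring[0][0] <= now_ms:
            _, packet = self.ring.popleft()
            return packet
        return None
```

Because every entry uses the same timeout, only the head of the ring needs to be checked. The model also makes two limits of the approach visible: expiry is only noticed when some packet arrives, and at most one buffered packet is handled per arrival.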

 

 

 

From: Mihai Budiu <mbu...@vmware.com>
Sent: Monday, June 14, 2021 7:55 PM
To: hem...@mnkcg.com; Gergely Pongracz <Gergely....@ericsson.com>; jnfo...@cs.cornell.edu
Subject: RE: [P4-design] Re: new language features for "5G" and "edge" use cases

 

I don’t know of any programming language where timers are a built-in construct; they are always library functions, i.e., externs for us. So I don’t see a choice. Perhaps the question is “which architecture/library file should the timers be a part of?”

 

Mihai

hem...@mnkcg.com

Jun 17, 2021, 11:24:14 AM
to p4-d...@lists.p4.org

To store packets, I would need an array of packet_in in P4, which would be a new construct for P4.

Mihai Budiu

Jun 17, 2021, 8:17:20 PM
to p4-d...@lists.p4.org

packet_in is an extern. There are no operations on externs except instantiation and method calls.

In particular, there is no assignment between externs (except compile-time binding as parameters).

If you want to do something like this you will probably have to invent a new extern to represent a dynamic array.

 

Mihai

Gergely Pongracz

Jun 18, 2021, 5:05:52 AM
to p4-d...@lists.p4.org

I guess the “only headers can be stored” could work as long as that header can be 1.5 kB long. 😉

Unfortunately I guess this is not the case, so my proposed workaround doesn’t really work generally.

 

Then, as Hemant said, we’d need something (extern, language construct) to be able to work on the payload. Or we’d need to increase the size of the header structure to 1.5 kB (this way jumbo frames would still cause problems, but the majority of the use cases would work). Actually I think a payload buffer would be simpler, as that would require less functionality (basically store, send, delete), but if we could extend the header that would open up interesting possibilities, e.g. HTTP/SIP parsing, proxies, DPI.

 

BR,

Mihai Budiu

Jun 18, 2021, 4:35:37 PM
to p4-d...@lists.p4.org

A header is defined by a P4 parser. If you are willing to parse an entire packet then you won’t have a payload. But in general, you can’t write parsers for arbitrary-size packets, even using varbits – in the extract call you have to *know* the header size that you want to parse.

 

Moreover, hardware devices like Tofino have a limit on both the number of parser transitions that they can execute for a packet and the number of bytes parsed per state. These limits are very natural: if you expect N packets/second/parser and you need P parser transitions per packet, then you need to perform P * N * clock period transitions/second to parse the packets (a device like Tofino has fewer parsers than input ports). This puts a bound on P if you want to sustain this bandwidth without dropping packets (or buffering them). You can statically bound P for the worst case and reject programs that have parsers that are too complex, or assume something about the expected duration P for the mix of packets you are getting (P is not a constant, it depends on the packet) and hope that you have enough parser bandwidth to cope with all packets in practice.

 

For a NIC N is much smaller than for a router, so it may be feasible to do complex parsing, but for a high speed switch you really don’t have enough time to do deep parsing at line rate.
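
The bound on P can be made concrete with a little arithmetic. The Python sketch below assumes, purely for illustration, that a parser performs one transition per clock cycle; under that assumption P * N * clock period is the fraction of time the parser is busy and must not exceed 1. The function names and the numbers in the usage note are hypothetical and do not describe any real device.

```python
def parser_utilization(packets_per_sec: int, transitions_per_packet: int,
                       clock_hz: int) -> float:
    """Fraction of parser cycles used: P * N * clock period, with period = 1/clock_hz."""
    return packets_per_sec * transitions_per_packet / clock_hz


def max_transitions_per_packet(packets_per_sec: int, clock_hz: int) -> int:
    """Largest P that keeps utilization at or below 1 (assumes one transition per cycle)."""
    return clock_hz // packets_per_sec
```

For example, with an assumed 1 GHz parser clock and 100 million packets per second per parser, `max_transitions_per_packet(100_000_000, 1_000_000_000)` allows 10 transitions per packet, while at 1 million packets per second the same parser could afford 1000.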

hem...@mnkcg.com

Jun 22, 2021, 3:26:24 PM
to p4-d...@lists.p4.org

First, thanks to Gergely for open-sourcing his code.

 

Second, I found two issues (https://github.com/p4lang/p4c/issues/2793 and https://github.com/p4lang/p4c/issues/2797) in the ebpf backend when compiling the code, and Mihai did a stellar job fixing both. Now the ebpf backend is reaching production quality. Great news! Kudos to Mihai.

 

Thanks,

 

Hemant

 


hem...@mnkcg.com

Jul 26, 2022, 4:26:54 PM
to mbu...@vmware.com, Gergely....@ericsson.com, jnfo...@cs.cornell.edu, p4-d...@lists.p4.org

Today, I thought some more about this one. 3GPP standards mandate up to 65k packet buffering, which no switching asic running at 12 Tbps can support. Further, buffering a few packets for reassembly of fragmented packets is possible in an asic, but reassembly is an end-host function. Why do reassembly in a switching asic?

 

Even a cpu system such as VPP has a tight processing budget. VPP runs two instructions in one clock cycle across the board.

 

One could use a high-end FPGA for this work. But P4-to-FPGA synthesis has some limitations on which P4 subset is supported.

 

The cpu system seems to be the only choice. VPP supports a periodic process for any plugin, and that process handles events defined in an enum, so a timer can be avoided. Further, since VPP uses DPDK for packet i/o, if one uses one packet in a VPP vector, one is running DPDK with the benefits of the VPP CLI, tracing, and control plane. Regarding P4toVPP, my company has a compiler, https://mnkcg.com/products/p4-to-vpp-compiler/ .

Gergely Pongracz

Aug 4, 2022, 8:22:03 AM
to hem...@mnkcg.com, mbu...@vmware.com, jnfo...@cs.cornell.edu, p4-d...@lists.p4.org

Hi,

 

Thanks for resurrecting this old thread, nice surprise.

 

I think this 65k limit can be made less strict in the coming releases (e.g., 6G timeframe) if we have good arguments for that. One good argument could be that it can’t be supported in asics. But there is one misunderstanding: this is a retransmit buffer only, so no segmentation / reassembly is needed. Regarding the number of packets: since this is a retransmit loop, it is basically characterized by the over-the-air RTT and the bandwidth. In 5G the typical values are 5-10 msec and a few Gbps per cell, so we’ll need 1-2 MB of space for the retransmit buffer. A typical base station handles multiple cells, so the buffer will be in the range of a few tens of MB.

 

I do agree that this is not a suitable task for a Tofino-like device. In RAN we don’t have terabits per sec bandwidth, but we might want to do more processing per packet. A CPU on the other hand seems a bit too heavy for this.

BR,

 

Gergely
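
The sizing argument above is a bandwidth-delay product, which can be written out as a short Python sketch. The function name and the concrete rate and cell count in the usage note are assumptions for illustration; the message itself only says "a few Gbps per cell" and "multiple cells".

```python
def retransmit_buffer_bytes(rtt_ms: int, rate_mbps: int, cells: int = 1) -> int:
    """Bytes in flight during one RTT: RTT * rate, summed over all cells.

    1 ms at 1 Mbit/s carries 1000 bits, so bits = rtt_ms * rate_mbps * 1000.
    """
    bits_in_flight = rtt_ms * rate_mbps * 1000
    return cells * bits_in_flight // 8
```

With an assumed 2 Gbps per cell, `retransmit_buffer_bytes(5, 2000)` gives 1,250,000 bytes and `retransmit_buffer_bytes(10, 2000)` gives 2,500,000 bytes, roughly the range quoted above. With an assumed 12 cells at 5 msec the total is 15,000,000 bytes, in line with "a few tens of MB".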

hem...@mnkcg.com

Aug 4, 2022, 9:51:13 AM
to Gergely....@ericsson.com, mbu...@vmware.com, jnfo...@cs.cornell.edu, p4-d...@lists.p4.org

Thanks for the clarifications. Seems like a smartNIC would work for the retransmit buffer. In your slides you say a SmartSwitch runs RLC at 5000 Mpps. What asic did the switch use and how did you program the asic? Did you use C?

 

Hemant

Gergely Pongracz

Aug 5, 2022, 9:04:38 AM
to hem...@mnkcg.com, mbu...@vmware.com, jnfo...@cs.cornell.edu, p4-d...@lists.p4.org

Hi,

 

The real base station asic is pretty much a DSP as most of the task is to en/decode signals. But I don’t know the latest details about the bst hardware. In our prototype we decoupled the signal processing part from the rest, so in our case we used a Tofino (not very ideal) for the packet processing tasks + a CPU (or a k8s service) for buffering / retransmission. Tofino code was in P4, while the CPU code was in C (using the dpdk libraries).

hem...@mnkcg.com

Aug 5, 2022, 3:14:49 PM8/5/22
to Gergely Pongracz, mbu...@vmware.com, jnfo...@cs.cornell.edu, p4-d...@lists.p4.org

Thanks for the details.  

 

Why was Tofino not ideal?

hem...@mnkcg.com

Aug 9, 2022, 9:46:19 AM8/9/22
to Gergely Pongracz, mbu...@vmware.com, jnfo...@cs.cornell.edu, p4-d...@lists.p4.org

Got it, thanks.  

 

The Cavium/Marvell Octeon has been used for the kind of tasks you ran on Tofino, but the Octeon is not P4-programmable. The Octeon is flexible, with 16, 32, or 42 cores. If your need is 1 Tbps or a little less, VPP can run IPv4 crypto on x86 at 1 Tbps on a single machine. While you wait for a smaller Tofino or for the Octeon to become P4-programmable, consider VPP, because point-to-point flows will run faster in VPP than in DPDK.

 

Spinning a new ASIC or writing a new p4c backend for Octeon are long poles. If I want to ship today, VPP works, and a P4toVPP compiler and tools are available from my company. I also said our compiler supports P4toDPDK by running one packet in a VPP vector. Additionally, since VPP uses DPDK for packet I/O, one gets HQF QoS from DPDK. No switching ASIC supports HQF, and HQF is the only means I know of to provide QoS for individual UE flows.

Hemant

 


Gergely Pongracz

Aug 9, 2022, 10:30:21 AM8/9/22
to hem...@mnkcg.com, mbu...@vmware.com, jnfo...@cs.cornell.edu, p4-d...@lists.p4.org

Hi,

 

What I meant is that Tofino supports much higher bandwidth than we need. If someone designed a “small Tofino”, e.g., around a terabit/sec (maybe even under), that consumes (much) less power and costs less, that would be ideal for this task.

 

Gergely

Gergely Pongracz

Aug 9, 2022, 4:46:36 PM8/9/22
to hem...@mnkcg.com, mbu...@vmware.com, jnfo...@cs.cornell.edu, p4-d...@lists.p4.org

Hi Hemant,

 

It’s tempting to try this, although as far as I know an x86 server consumes roughly the same power as a Tofino, but at least it’s cheaper. Currently we have other activities prioritized, but if you give us a link to your compiler, giving it a try seems a relatively short task. I don’t want to mislead you, though: our activity was far from a product. It was pure research to understand the capabilities of P4 and similar languages in our domain. Actually, it was a continuation of the activities documented here: http://lendulet.tmit.bme.hu/~retvari/publications/jsac_2018.pdf

 

Actually, we also had a P4-to-Octeon compiler (an old t4p4s branch), but we found that Octeon scaled nicely only up to 8 cores. Admittedly, we had an old 24-core model (84xx or something like that) and the newer ones might be better, but we discontinued that track.

hem...@mnkcg.com

Aug 9, 2022, 9:26:58 PM8/9/22
to hem...@mnkcg.com, Gergely....@ericsson.com, mbu...@vmware.com, jnfo...@cs.cornell.edu, p4-d...@lists.p4.org

Sorry, there is a typo in my email below. Octeon has a 48-core version, not 42.

 

Also, a switch uses a CPU in addition to the switching ASIC. If an Octeon is used, one or two cores are available for CPU operations such as a control plane.

 

Hemant


hem...@mnkcg.com

Aug 11, 2022, 9:33:16 AM8/11/22
to Gergely Pongracz, p4-d...@lists.p4.org

Gergely,

 

Here is your 1 Tbps switching platform using the Octeon.  See slide 13 at the link below.

 

https://www.marvell.com/content/dam/marvell/en/company/media-kit/octeon-10/marvell-octeon-10-media-deck.pdf

 

MNK’s p4c backend unifies DPDK and VPP software development for any target, so that only VPP is needed. Use our compiler to generate a VPP plugin that processes one packet per vector, and you are effectively running DPDK with the benefits of the VPP infrastructure (CLI, API, tests, packet trace, debug, etc.).

 

Marvell should know of an ODM vendor that can supply a switch using the Octeon. You can get the P4toVPP compiler from my company.

 

The Octeon also supports VPP in hardware. I’d like to understand how the VPP hardware subcomponent is programmed. A pointer to an open doc would help; if there is none, I can sign an NDA to get any doc.
