Xelerated Xpress

Insight on Carrier Ethernet and Beyond

Multi-core Disappointment – Here We Go Again

It is time this industry learned from its historic mistakes. If not, we may spend huge amounts of engineering effort only to discover that the path taken is a dead end. I recently read an article by Simon Stanley of Light Reading in which his research indicates that multi-core processors are being evaluated for applications in the network processing field. I don’t see this trend when talking to the major network equipment vendors, and this might be because they also remember what multi-core processors couldn’t deliver ten years ago.

Those who are seriously evaluating multi-core architectures for packet processing should be prepared for some surprises. Again. The same evolution has happened several times before, with separate processors brought together in a multi-processor architecture on the same die to scale processing performance. Have people already forgotten why previous multi-core proposals for packet processing did not fly?

First, multi-core architectures consume a lot of power. Second, they are not designed for deterministic wirespeed performance. Third, they are difficult to program efficiently – making it hard to meet the performance requirements of modern packet processing applications. And programming them in ANSI C does not solve the efficiency and performance challenges.

Multi-core processors are designed for general-purpose use, and they are therefore not optimized for packet processing. They lack the necessary service density. Xelerated’s Dataflow Architecture, in contrast, was designed to solve the challenge of combining programmability with highly efficient packet processing. It is a linearly scalable, wirespeed-by-design processing architecture with low power consumption and high service density.

Support for 40 or 100 G interfaces does not say anything about a device’s ability to run a meaningful application at these speeds. And when looking into the requirements of advanced Carrier Ethernet, Fiber Access or Mobile Backhaul applications, general-purpose multi-core designs continue to fall short. To give you an idea: Xelerated’s new HX330 has over 9 times the service density of the highest-end multi-core processor on the market. That is, it has 9 times the processing capacity for network and packet processing!

It is time to learn from history. Multi-core architectures have a bright future in general applications, for the server and consumer markets. Here is where they belong – processing applications, not processing packets.

There is a reason why 20+ NPU vendors that spent millions of dollars on multi-core architectures failed to deliver a commercially and technically viable option to the networking industry. This history is only ten years old. I’m confident network equipment vendors have a longer memory than this.

by Thomas Eklund on Apr. 7th, 2010


The Need for More Service Density

Service Density evolution

New computer services tend to take advantage of the latest processing and storage resources. A new hard drive may give you ten times the storage capacity, but you may still find it half full only one year after the initial purchase. This is also when you find the PC’s dual core processor running at 2.1 GHz is under constant heavy load. Is this a law of nature? It quite possibly could be.

The Network Processor (NPU) industry is no different. A new generation comes with significantly more processing power, and still our customers ask for more. There are more standards coming down the road, and there are more features requested by service providers around the world. Demand is always more, never less.

At the recent Linley tech seminar, Xelerated’s Vice President of Business Development, Thomas Eklund, delivered a presentation – available for download on Slideshare – on the importance of service density in the NPU business. If your processing device can’t manage all the network services expected, then you will find your Research & Development department under heavy pressure to balance features and performance against available resources. The need for greater service density is an important aspect for the whole service provider industry. If next generation routers and switching platforms can be built with greater headroom for new services, we can extend the lifetime of the equipment, and thereby strengthen the business case for the broadband services.

Service density can be a bit hard to measure, but doing some very basic calculations on the raw service processing capabilities of a particular chip is fairly straightforward, and this will give you a rough idea of what the chip can do. Failing to do this when evaluating a chip for a new line card design often leads to unhappy surprises at a very late stage in the project. In turn, this brings the risk of missing an important market window, as well as significantly increased engineering and product costs. So take a look and find out – how much service density is your chip providing?
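As a rough illustration of the kind of basic calculation meant here, the Python sketch below divides a chip's aggregate processing and lookup rates by the packet rate it must sustain, giving a per-packet budget. The function name and all the figures are hypothetical and chosen only to make the arithmetic easy to follow; they do not describe any real device.

```python
def per_packet_budget(ops_per_second: float,
                      lookups_per_second: float,
                      packets_per_second: float) -> tuple[float, float]:
    """Return (operations per packet, lookups per packet) at a given packet rate."""
    if packets_per_second <= 0:
        raise ValueError("packets_per_second must be positive")
    return (ops_per_second / packets_per_second,
            lookups_per_second / packets_per_second)


# Hypothetical chip: 300 billion operations/s and 3 billion lookups/s,
# required to sustain 150 million packets per second.
ops, lookups = per_packet_budget(300e9, 3e9, 150e6)
print(f"{ops:.0f} operations and {lookups:.0f} lookups available per packet")
```

If the services planned for the line card need more operations or lookups per packet than this budget allows, the shortfall is visible on paper long before the hardware is built.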

Update: this post was re-published at Advanced TCA Systems Web Site.

by Per Lembre on Mar. 10th, 2010


Time to Clarify Service Density

This industry struggles with a communication issue. One of the most important aspects of network processing – service density – lacks a common definition, and because of this, there is no widely accepted way to measure it.

Pipeline of processor cores and engine access points

The density of processing resources, here illustrated by part of the single pipeline of processor cores and engine access points featured in the Dataflow Architecture, defines how many services a chip can support.

We all know how to measure link bandwidth; we do this in Gigabits per second (Gbps). Likewise, we have a common understanding of how to measure the raw performance of packet processing; Megapackets per second (Mpps). But we don’t fully agree on how to measure the capabilities of parsing, classification and modification – important tasks which are performed by the network processor (NPU).
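The two metrics we do agree on are linked by frame size. As a small sketch (the function and constant names are my own), the Python below computes the packet rate needed to keep an Ethernet link full of minimum-size frames, assuming the standard 64-byte minimum frame plus 20 bytes of preamble, start-of-frame delimiter and inter-frame gap on the wire.

```python
MIN_FRAME_BYTES = 64         # minimum Ethernet frame, including FCS
WIRE_OVERHEAD_BYTES = 20     # 7 preamble + 1 start-of-frame delimiter + 12 inter-frame gap


def wirespeed_mpps(link_gbps: float, frame_bytes: int = MIN_FRAME_BYTES) -> float:
    """Packet rate in Mpps needed to fill a link with frames of the given size."""
    bits_on_wire = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return link_gbps * 1e9 / bits_on_wire / 1e6


for gbps in (10, 40, 100):
    print(f"{gbps} Gbps -> {wirespeed_mpps(gbps):.2f} Mpps at minimum frame size")
```

The worst case for the processor is the smallest frame: the link rate stays the same, but the number of packets to parse, classify and modify per second is at its highest.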

Over what is now a few generations of NPU development at Xelerated, it is fair to say we have spent a large part of our system engineering on increasing service density. The first generation of NPUs, the X10q family, was initially released in 2002. It was a 40 Gbps NPU with 40-100 Mpps, depending on type. In the next generation, the X11 family of NPUs, the greatest achievement was (yes, you guessed it!) an increase in service density. (Okay, some people may argue that the integration of more interface types and GE MACs was key to the success. But I still think the list of services and features the X11 performs – in parallel and at wirespeed – must be one of the greatest achievements in the industry at the time.)

Now moving to the third generation, the HX family of NPUs. This is a 100 Gbps and 150 Mpps NPU family, which is significantly more than the X11, sure. But again, we make a giant leap forward in terms of service density. This means more instructions per packet and more lookup bandwidth per packet type. The result is that more services can be delivered in a single chip.

For the system vendors, making the correct assumptions on service density is one of the most strategic tasks for product management. There are many cases where the HX device consolidates three to four ingress and egress processing chips (be it custom ASICs or merchant NPUs) into one. It impacts COGS, margins, and in the end, the whole business case for the product.

The lack of a commonly accepted definition of service density makes the dialog between silicon and system vendors unnecessarily blurred and full of misunderstandings.

So let us start the job of defining the term. I do not have a perfect answer to this. As can be seen on the Xelerated product pages, we measure service density only by comparing the capabilities of the different devices within the same family. As we lack a more general definition, we cannot measure service density across product families. Yet.

The definition should take into account both the number of operations per packet and the classification resources available per packet. And another component also needs to be added to the equation: the packet processing needs to be achieved at wirespeed for a specified link rate. Without a hard performance target in terms of Mpps, the whole discussion just falls short. So let’s get the discussion rolling… for the evolving services in the metro space, next generation platforms will be dependent on this definition.
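To get the discussion started, here is one possible way to make the three components concrete. It is only a sketch, not an agreed definition: the class names, field names and numbers are all hypothetical. A device is described by its sustained packet rate and its per-packet budget of operations and lookups, and a set of services fits only if the device meets the Mpps target and the summed per-packet cost stays within both budgets.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Device:
    name: str
    sustained_mpps: float       # packet rate the device can guarantee
    ops_per_packet: int         # operations available per packet at that rate
    lookups_per_packet: int     # classification lookups available per packet


@dataclass(frozen=True)
class Service:
    name: str
    ops: int                    # operations the service needs per packet
    lookups: int                # lookups the service needs per packet


def fits_at_wirespeed(device: Device, target_mpps: float,
                      services: Iterable[Service]) -> bool:
    """True if the device meets the Mpps target and both per-packet budgets."""
    if device.sustained_mpps < target_mpps:
        return False            # no hard wirespeed target met, no comparison
    services = list(services)
    total_ops = sum(s.ops for s in services)
    total_lookups = sum(s.lookups for s in services)
    return (total_ops <= device.ops_per_packet
            and total_lookups <= device.lookups_per_packet)


# Hypothetical device and service costs, for illustration only.
chip = Device("example-npu", sustained_mpps=150.0,
              ops_per_packet=2000, lookups_per_packet=20)
bundle = [Service("l2-switching", 300, 4),
          Service("mpls", 500, 6),
          Service("qos-marking", 400, 3)]
print(fits_at_wirespeed(chip, 150.0, bundle))
```

Gating on the packet rate first reflects the point above: per-packet resources only mean something once the wirespeed target is fixed.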

by Per Lembre on Nov. 24th, 2009
