Xelerated Xpress

Insight on Carrier Ethernet and Beyond

Busy Times

Today marked a major milestone for Xelerated as we announced the availability of samples of the HX network processor. The chip is back from manufacturing, and it is running at full speed in the lab. Samples are also with customers at this point, and the whole company is going full steam ahead to support our customers in the design of next generation Carrier Ethernet switches and routers.

Over the next few months, we are planning a series of events to demonstrate the HX network processor’s wirespeed capabilities. Stay tuned for more news and information on 100 Gbit/s wirespeed network processing.


by Per Lembre on Jun. 30th, 2010


When There Is a Special Purpose

Network processors are very flexible. They can be programmed for any type of data plane services. Every customer project is unique; all organizations want to compete with features and functionality as they continue to come up with new, innovative ways to make the most out of NPU silicon.

Still, there are limitations to what network processors can do. They are developed for a special purpose: to process packets very efficiently. What they do, they do very well. But if you want to accomplish anything other than packet processing, you will need another type of processor. General-purpose multicore processors would be the choice. But can’t they be used for packet processing as well?

For a deep dive into this subject, I recommend reading the recent article in Electronic Design by Håkan Zeffer and me. We compare the special-purpose dataflow architecture, which is found in all Xelerated NPUs, with the general-purpose multicore architectures that are popular in today’s servers. Both have their merits, and a comparison should be made for the target application. When looking at the architectures, you can start to calculate how efficient the different approaches are for different types of applications. I hope you find the conclusions interesting.

by Per Lembre on May. 28th, 2010


Multi-core Disappointment – Here We Go Again

It is time this industry learned from its historic mistakes. If not, we may spend huge amounts of engineering effort only to discover the path taken is a dead end. I read a recent article by Simon Stanley of Light Reading, in which his research indicates that multi-core processors are being evaluated for applications in the network processing field. I don’t see this trend when talking to the major network equipment vendors, and this might be because they also remember what multi-core processors couldn’t deliver ten years ago.

Those who are seriously evaluating multi-core architectures for packet processing should be prepared for some surprises. Again. The same evolution has happened several times before, with separate processors brought together in a multi-processor architecture on the same die to scale processing performance. Have people already forgotten why previous multi-core proposals for packet processing did not fly?

First, multi-core architectures consume a lot of power. Second, they are not designed for deterministic wirespeed performance. Third, they are difficult to program efficiently, making it hard to meet the performance requirements in modern packet processing applications. And using ANSI C does not solve the efficiency and performance challenges.

Multi-core processors are designed for general purposes, and they are therefore not optimized for packet processing. They lack the necessary service density. Xelerated’s Dataflow Architecture, in contrast, was designed to solve the challenge of combining programmability and super-efficient packet processing. It is a linearly scalable, wirespeed-by-design processing architecture with low power and high service density.

Support for 40 or 100 G interfaces does not say anything about the device’s ability to perform a meaningful application at these speeds. And when looking into the requirements in advanced Carrier Ethernet, Fiber Access or Mobile Backhaul applications, general-purpose multi-core designs continue to fall short. To give you an idea: Xelerated’s new HX330 has over 900 percent of the service density of the most high-end multi-core processor on the market. That is, it has more than 9 times the processing capacity for network and packet processing!

It is time to learn from history. Multi-core architectures have a bright future in general applications, for the server and consumer markets. Here is where they belong – processing applications, not processing packets.

There is a reason why 20+ NPU vendors that spent millions of dollars on multi-core architectures failed to deliver a commercially and technically viable option to the networking industry. This history is just ten years old. I’m confident network equipment vendors have a longer memory than this.

by Thomas Eklund on Apr. 7th, 2010


Higher Synchronization Demands, Not Less

My colleague Tord Haulin gave me this brief summary of the synchronization demands as mobile backhaul networks go packet. You can think of the demands as a three-step evolution:

1. Distribution of precise frequency

Requirement for e.g. WCDMA/UMTS transport equipment: 15 parts per billion (ppb)

This means: It is OK to deliver up to a dozen too many or too few clock cycles every day (for an 8 kHz reference frequency).

2. Distribution of locked frequency with tight phase control

Requirement for e.g. Fixed WiMAX transport equipment: 4.3 microseconds

This means: The precise number of clock cycles has to be delivered every year. It is OK to lead or lag by up to a dozen degrees at any time (also for an 8 kHz reference).

3. Distribution of time of day

Requirement for e.g. the radio interface of CDMA/CDMA2000: 1 microsecond

This means: A time-carrying signal cannot be sent more than 150 m without compensating for transport delay. That would be one city block(!)
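
The three figures above can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names are made up for this example, and the 0.67 velocity factor for fiber is an assumed typical value, not a number from Tord's summary.

```python
SECONDS_PER_DAY = 86_400
SPEED_OF_LIGHT_M_S = 299_792_458  # speed of light in vacuum
REF_HZ = 8_000                    # the 8 kHz reference frequency used in the text


def slipped_cycles_per_day(accuracy_ppb, ref_hz=REF_HZ):
    """Clock cycles gained or lost per day at a given frequency accuracy."""
    return ref_hz * SECONDS_PER_DAY * accuracy_ppb * 1e-9


def phase_error_degrees(time_error_s, ref_hz=REF_HZ):
    """Phase lead or lag, in degrees of one reference period, for a time error."""
    return time_error_s * ref_hz * 360.0


def propagation_distance_m(time_s, velocity_factor=1.0):
    """Distance a signal covers in time_s; velocity_factor=1.0 means vacuum."""
    return SPEED_OF_LIGHT_M_S * velocity_factor * time_s


print(f"15 ppb at 8 kHz: {slipped_cycles_per_day(15):.1f} cycles per day")
print(f"4.3 us at 8 kHz: {phase_error_degrees(4.3e-6):.1f} degrees")
print(f"1 us in vacuum:  {propagation_distance_m(1e-6):.0f} m")
# 0.67 is an assumed velocity factor for optical fiber, not a value from the post.
print(f"1 us in fiber:   {propagation_distance_m(1e-6, 0.67):.0f} m")
```

15 ppb works out to roughly 10 cycles per day and 4.3 microseconds to roughly 12 degrees, both consistent with "a dozen". One microsecond corresponds to about 300 m in vacuum and about 200 m at the assumed fiber velocity factor, so the 150 m figure is on the conservative side of the raw propagation distance; the summary does not say how that margin was chosen.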

Measuring time as distance travelled at the speed of light, and frequency wander as lost clock cycles, puts the hard mobile infrastructure requirements on frequency, phase and time of day distribution into perspective. For those interested in more details, I recommend taking a look at Xelerated’s precision time solution white paper.

More on this topic will also be presented at the upcoming MPLS & Ethernet World Congress 2010 and Ethernet Wholesale Summit 2010 in Paris, February 9-12, 2010. Xelerated will be there as well as many of our industry peers.

Look forward to seeing you there!

by Per Lembre on Jan. 27th, 2010


Today Is Celebration Day

December 10 is known as the date of the Nobel Prize ceremony in Stockholm and, by coincidence, the Global Semiconductor Alliance (GSA) chose the same date for this year’s GSA awards dinner celebration in Santa Clara. So when I watch the Swedish television broadcast from the elegant banquet at the Stockholm City Hall, my thoughts are with my colleagues in Santa Clara. Xelerated received the award for Outstanding Financial Performance by Private Semiconductor Company.

This is a good day to celebrate.


by Per Lembre on Dec. 10th, 2009


IEEE P802.3ba version 3 Is Ready

It is encouraging to see that the standardization work on 100 GE and 40 GE is coming to an end. Draft version 3, more formally IEEE P802.3ba/D3, is now ready, and it looks like it will be passed and submitted to Sponsor Ballot.

This is the formal start of next generation Ethernet innovation. This is good news for the industry, and well supported by us here at Xelerated.

by Thomas Eklund on Dec. 3rd, 2009


ADSL Is Losing to Fiber

I read a very interesting article in Computer Sweden which presents the latest broadband statistics for the Swedish market. Fiber-based broadband access is growing, and copper-based ADSL is losing market share. Fiber is not only growing faster, but now, for the first time, the number of ADSL subscriptions is actually declining. Also, more people are using IP telephony than POTS. However, both are losing to cellular phone services.

This is yet another statistical proof point that copper-based services are on a downhill slope. The pace of change to fiber seems to be faster than many of us expected.

by Thomas Eklund on Nov. 27th, 2009


Time to Clarify Service Density

This industry struggles with a communication issue. One of the most important aspects of network processing – service density – lacks a common definition, and because of this, there is no widely accepted way to measure it.

The density of processing resources, here illustrated by a part of the single pipeline of processor cores and engine access points featured in the Dataflow Architecture, defines how many services a chip can support.

We all know how to measure link bandwidth: we do this in gigabits per second (Gbps). Likewise, we have a common understanding of how to measure the raw performance of packet processing: in megapackets per second (Mpps). But we don’t fully agree on how to measure the capabilities of parsing, classification and modification, the important tasks performed by the network processor (NPU).

Over what is now a few generations of NPU development at Xelerated, it is fair to say we have spent a large part of our system engineering on increasing service density. The first generation of NPUs, the X10q family, was initially released in 2002. It was a 40 Gbps NPU with 40-100 Mpps, depending on type. In the next generation, the X11 family of NPUs, the greatest achievement was (yes, you are correct!) the increase in service density. (Okay, some people may argue that the integration of more interface types and GE MACs was key to the success. But I still think the list of services and features the X11 performs, in parallel and at wirespeed, must be one of the greatest achievements in the industry at the time.)

Now we are moving to the third generation, the HX family of NPUs. This is a 100 Gbps and 150 Mpps NPU family, which is significantly more than the X11, sure. But again, we make a giant leap forward in terms of service density. This means more instructions per packet and more lookup bandwidth per packet type. The result is that more services can be delivered in a single chip.
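
The pairing of 100 Gbps with 150 Mpps is close to what wirespeed means for minimum-size Ethernet frames. The sketch below assumes standard Ethernet framing (a 64-byte minimum frame plus 8 bytes of preamble and a 12-byte inter-frame gap); the function name is made up for illustration and is not part of any Xelerated tool.

```python
MIN_FRAME_BYTES = 64       # minimum Ethernet frame, including FCS
PREAMBLE_BYTES = 8         # preamble plus start-of-frame delimiter
INTERFRAME_GAP_BYTES = 12  # minimum idle time between frames, in byte times


def wirespeed_mpps(link_gbps, frame_bytes=MIN_FRAME_BYTES):
    """Packet rate in Mpps needed to keep a link full of frames of one size."""
    bits_on_wire = (frame_bytes + PREAMBLE_BYTES + INTERFRAME_GAP_BYTES) * 8
    return link_gbps * 1e9 / bits_on_wire / 1e6


for gbps in (10, 40, 100):
    print(f"{gbps:>3} Gbps -> {wirespeed_mpps(gbps):.2f} Mpps at minimum frame size")
```

At 100 Gbps this gives about 148.8 Mpps, so a device has to sustain roughly 150 Mpps before it can claim wirespeed for every frame size on such a link.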

For the system vendors, making the correct assumptions on service density is one of the most strategic tasks for product management. There are many cases where the HX device consolidates three to four ingress and egress processing chips (be it custom ASICs or merchant NPUs) into one. It impacts COGS, margins, and in the end, the whole business case for the product.

The lack of a commonly accepted definition of service density makes the dialog between silicon and system vendors unnecessarily blurred and full of misunderstandings.

So let us start the job of defining the term. I do not have a perfect answer to this. As can be seen on the Xelerated product pages, we measure service density only by comparing the capabilities of the different devices within the same family. As we lack a more general definition, we cannot measure service density across product families. Yet.

The definition should take both the number of operations per packet and the classification resources per packet into account. Another component also needs to be added to the equation: the packet processing needs to be achieved at wirespeed for a specified link rate. Without a hard performance target in terms of Mpps, the whole discussion just falls short. So let’s get the discussion rolling… For the evolving services in the metro space, next generation platforms will depend on this definition.
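
One way to make these ingredients concrete is to express service density as a per-packet budget at a fixed wirespeed target. The sketch below is an assumption offered for discussion, not an agreed definition: the class, its fields and the sample numbers are all hypothetical and do not describe any real device.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    name: str
    instructions_per_second: float  # aggregate instruction rate (hypothetical field)
    lookups_per_second: float       # aggregate classification/lookup rate (hypothetical field)


def per_packet_budget(device, target_mpps):
    """Instructions and lookups available per packet at the wirespeed target."""
    packets_per_second = target_mpps * 1e6
    return (
        device.instructions_per_second / packets_per_second,
        device.lookups_per_second / packets_per_second,
    )


# Made-up numbers, for illustration only.
chip_a = Device("chip A", instructions_per_second=150e9, lookups_per_second=3e9)
chip_b = Device("chip B", instructions_per_second=30e9, lookups_per_second=1.5e9)

for chip in (chip_a, chip_b):
    instructions, lookups = per_packet_budget(chip, target_mpps=150)
    print(f"{chip.name}: {instructions:.0f} instructions and {lookups:.0f} lookups per packet")
```

The comparison only means something because both devices are held to the same packet rate. Relaxing the Mpps target inflates every per-packet number, which is why the wirespeed requirement has to be part of the definition.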

by Per Lembre on Nov. 24th, 2009


Observations at the Linley Data Center Seminar

I attended Linley Group’s data center seminar this Tuesday to learn more about the latest data center trends. Xelerated’s Anders Wirkestrand presented on the network processor unit’s (NPU) role in data centers as a key catalyst for virtualization.

One key observation is that an NPU has an enormous amount of service density and could, when used together with a multicore processor, increase the overall performance and transaction rate while lowering power consumption significantly. The number of instructions per packet is 23 times (yes, you read that correctly!) that of a state-of-the-art Intel Core 2 Extreme QX9770 multicore processor. Combining the strengths of a state-of-the-art multicore processor and a Xelerated NPU can dramatically improve the overall solution.

It is interesting to see that the data center server players are adding switching functionality while, at the same time, the router and switch vendors are adding server functionality. They end up competing with each other. When the switch becomes a server, and the server becomes a switch, it opens up a period of strong innovation. Which architecture model will prevail in the future?

by Thomas Eklund on Nov. 13th, 2009


Technology Recycling

It is funny to see how many good ideas come at the wrong time but can later be recycled when the timing is right. As we all know, predicting the timing of a technology isn’t the easiest thing to do.

Look at the famous God Boxes that everyone thought were dead. Today they are the hottest new architecture in transport, and they are called P-OTS (hmm, I can’t think of a worse name for a plain old telephone system – sorry, packet optical transport system).

Another famous old technology is the thin client. But what does cloud computing enable now? That’s right: thin clients. As we enter the “Google world of Internet”, more and more applications will reside in the cloud, enabled by high-speed access networks and virtualization in the data centers.

Using IP as a bearer of data in the radio access network was interesting from a technology point of view ten years ago, but it never gained any traction with the carriers. Today, on the other hand, you see everyone launching new IP- or Ethernet-based radio access equipment to carry mobile data.

So what’s next – IPv6?

by Thomas Eklund on Nov. 10th, 2009
