
Catching Up – Lab49 Articles July 1, 2009

Posted by newyorkscot in Articles, HPC, Marketing, Visualization / UX.
1 comment so far

Life at Lab49 has been busy recently with a diverse set of client delivery, sales and marketing activities – all good news as the business continues to grow on both sides of the Atlantic.

On the marketing front, we have written a lot of great articles this year, and have been in a number of other features, providing commentary to the relevant stories. They can all be read on the Lab49 website here with the articles carved out in a separate page here. A few highlights:

Waters Special Report: IT in Crisis: Lab49 sponsored a special report in the SIFMA edition of the magazine, the theme of which was how financial institutions should address the current challenges presented by the financial crisis, and how, by investing in the right technology, firms can radically improve their agility and intelligence.

Robust, Reusable Drag-and-Drop Behavior in Silverlight discusses how developers can greatly increase the overall robustness and re-usability of any drag-and-drop implementation in Silverlight, through the manipulation of an element’s RenderTransform with attached behaviors.

Riding the Tsunami discusses why firms need to update their trading and risk infrastructures and implement a holistic approach with a balance of powerful new technologies. 

Building a bank from scratch hypothesizes what it would take to rebuild bank systems from scratch to support real-time data aggregation and analysis, to provide a platform to capitalize on next-generation technologies, and to accommodate constant change. The article also outlines what principles, technologies and processes should be put in place to facilitate the optimal solution.

Concurrency: Take Control or Fail discusses how new trends in hardware demand the adoption of parallel programming throughout the financial enterprise, and why firms need to start getting up to speed now, or risk falling behind.
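None of these articles include code, but to make the concurrency point concrete, here is a toy Java sketch of the kind of embarrassingly parallel pricing workload the article has in mind (the ParallelPricer class and its dummy price() function are illustrative only, not taken from the article):

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;

    // Toy example: price a book of trades in parallel across all available cores.
    public class ParallelPricer {

        // Stand-in for a real pricing model; the loop just burns CPU.
        static double price(int tradeId) {
            double v = 0;
            for (int i = 1; i <= 1_000_000; i++) {
                v += Math.sin(tradeId * i) / i;
            }
            return v;
        }

        public static void main(String[] args) throws Exception {
            int cores = Runtime.getRuntime().availableProcessors();
            ExecutorService pool = Executors.newFixedThreadPool(cores);

            List<Future<Double>> results = new ArrayList<>();
            for (int tradeId = 0; tradeId < 100; tradeId++) {
                final int id = tradeId;
                // Each trade becomes an independent task; the pool spreads them over the cores.
                results.add(pool.submit(() -> price(id)));
            }

            double total = 0;
            for (Future<Double> f : results) {
                total += f.get(); // blocks until that trade is priced
            }
            pool.shutdown();
            System.out.printf("Priced %d trades on %d cores, total = %.4f%n", results.size(), cores, total);
        }
    }

The point of the article is exactly that this style of decomposition, rather than single-threaded code plus faster clocks, is what the new hardware rewards.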


Waters Magazine: Flying By Wire August 1, 2008

Posted by newyorkscot in Agile, Articles, HPC, Marketing, Markets, Other.
comments closed

Waters Magazine has just published my article “Flying By Wire” in its Open Platform section of the August issue. The article discusses how advanced trading systems need better control systems to dependably innovate and take new opportunities to market. I draw an analogy between trading systems and modern jet aircraft where stability, performance and control are essential characteristics that need to be considered during design, development and testing. Read the article on the Lab49 website here.

Really Massive Complex Event Processing February 20, 2008

Posted by newyorkscot in Complex Event Processing, HPC, Science.
1 comment so far

Scientific American ran a feature this month on the Large Hadron Collider (LHC) being built by CERN to conduct the largest physics experiments ever. Aside from its sheer physical scale, one of the remarkable aspects of the project is the massive volume and frequency of data generated, making it probably the most impressive combination of complex event processing and distributed grid computing ever:

  • The LHC will accelerate 3000 bunches of 100 billion protons to the highest energies ever generated by a machine, colliding them head-on 30 million times a second, with each collision spewing out thousands of particles at nearly the speed of light.
  • There will be 600 million particle collisions every second, each one called an “event”.
  • The millions of channels of data streaming away from the detector produce about a megabyte of data from each event: a petabyte, or a billion megabytes, of it every two seconds.
  • More details here and diagram here.

This massive amount of streaming data needs to be converged, filtered and then processed by a tiered grid network. Starting with a few thousand computers/blades at CERN (Tier 0), the data is routed (via dedicated optical cables) to 12 major institutions around the world (Tier 1), and then finally down to a number of smaller computing centers at universities and research institutes (Tier 2). Interestingly, the raw data coming off the LHC is saved onto magnetic tape (allegedly the most cost-effective and secure format).

I wonder how many nanoseconds they took to consider which CEP vendor they wanted to use for this project?!!
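Snark aside, the basic filter-then-fan-out idea is easy to sketch: a first tier drops the overwhelming majority of events and only the survivors travel downstream. The toy Java pipeline below is purely my own illustration (the Event record, queue sizes and energy-threshold trigger are all invented), not anything CERN actually runs:

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.ThreadLocalRandom;

    // Toy two-tier event pipeline: a detector thread floods a queue with raw "events",
    // a Tier-0-style filter keeps only the interesting ones and hands them downstream.
    public class TieredEventFilter {

        record Event(long id, double energy) {}

        public static void main(String[] args) throws InterruptedException {
            BlockingQueue<Event> raw = new ArrayBlockingQueue<>(10_000);
            BlockingQueue<Event> interesting = new ArrayBlockingQueue<>(100_000);

            // "Detector": produces a million synthetic events.
            Thread detector = new Thread(() -> {
                for (long id = 0; id < 1_000_000; id++) {
                    try {
                        raw.put(new Event(id, ThreadLocalRandom.current().nextDouble()));
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                }
            });

            // "Tier 0": drops everything below an (arbitrary) energy threshold,
            // so only a tiny fraction of the stream travels to the next tier.
            Thread tier0 = new Thread(() -> {
                try {
                    for (long seen = 0; seen < 1_000_000; seen++) {
                        Event e = raw.take();
                        if (e.energy() > 0.999) {   // keep roughly 1 in 1000
                            interesting.put(e);
                        }
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });

            detector.start();
            tier0.start();
            detector.join();
            tier0.join();
            System.out.println("Events forwarded downstream: " + interesting.size());
        }
    }

The real system does the same thing at absurd scale: hardware triggers and the Tier 0 farm throw away nearly everything before the remainder is routed out to the Tier 1 and Tier 2 sites.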

Oracle Buying BEA January 16, 2008

Posted by newyorkscot in Complex Event Processing, HPC.
add a comment

Just announced an hour ago, for $8.5 billion. It will now be interesting to see what the Weblogic product strategy looks like with respect to providing distributed cache integration, i.e. when will Coherence be part of the stack?

Data Streaming Crosses the Chasm December 20, 2007

Posted by newyorkscot in Complex Event Processing, HPC, Marketing, Visualization / UX.
add a comment

Lab49‘s Daniel Chait provides SDTimes’ editor-in-chief David Rubinstein with his views on CEP, HPC, multi-core processing, WPF/Flex, etc, here.  Nice article, David.

Lab49 Client & Marketing Update.. December 12, 2007

Posted by newyorkscot in Client Engagement Mgt, Complex Event Processing, HPC, Marketing, SOA / Virtualization, Visualization / UX.
add a comment

The last few months and weeks have been a bit mad with a host of new client projects coming online, and a tidal wave of marketing activities.

On the project front, and despite some dodgy market conditions, we have been continuing to see (and have started) some interesting projects around automated trading: from market simulation environments to real-time pricing to risk management systems. We have also seen more projects on the buy-side where the level of innovation and adoption of the latest technologies is still impressive. In advanced visualization (specifically, WPF/Silverlight), we are starting to see interest across various trading businesses which is very promising going forward. We also continue to be involved in quite a few projects involving grid computing, distributed cache, etc.

On the marketing front, we have been busy publishing new articles, contributing to a number of features in various industry publications, and we are currently in the process of writing some thought leadership pieces for technology and finance publications. We have also been doing some great sales & marketing activities with some of our technology partners, including working on some new client opportunities and developing some demos leveraging WPF and CEP platforms. (We will also be starting to talk a bit more openly about our various partnerships.)

What’s great about the recent flurry of project and marketing activity has been the balance across high performance computing (grid, cache, etc), Java (J2EE, Spring, opensource), Microsoft (WPF, Silverlight) and other technologies (messaging, market data, visualization, etc), which really helps to show Lab49‘s depth and breadth across the technology space. Some highlights from the last few months include:

Lots more news, articles, features, partner updates, etc in the pipeline that I will post as they happen..

Cloud Computing from IBM and Google October 9, 2007

Posted by newyorkscot in HPC.
add a comment

Announcement today from IBM and Google that they are “..to offer a curriculum and support for software development on large-scale distributed computing systems, with six universities signing up so far.” With Amazon’s Elastic Compute Cloud, EC2, I wonder who is next to bring compute capabilities to the masses?

New Capital Markets Benchmarking Council September 25, 2007

Posted by newyorkscot in Complex Event Processing, HPC.
add a comment

The Securities Technology Analysis Center (STAC) recently announced the creation of a new benchmark council that includes some of the leading securities firms such as JPMC, Citigroup and HSBC. The new council will establish benchmarks in three areas:

  • Market data: benchmarks based on workloads such as direct exchange-feed integration, market data distribution, tick storage and retrieval, etc.
  • Analysis: benchmarks based on workloads such as trading algorithms, price generation, risk calculation, etc.
  • Execution: benchmarks based on workloads such as smart order routing, execution-related messaging, etc.

“STAC Benchmarks will measure the performance of software such as market data systems, messaging middleware, and complex event processing systems (CEP), as well as new underlying technologies, such as hardware-based feed and messaging solutions, hardware-based analytics accelerators, compute and data grid solutions, InfiniBand and 10-gigabit Ethernet networks, multicore processors, and the latest operating system and server technologies.”

Benchmarks in complex event processing, huh? That will be interesting. Will they be based on specific sets of use cases? Will they give us insight into the hype and myth of the various CEP vendors’ proclamations of processing gazillions of messages per second? Will they tell us what happens when you try to scale these products? I wonder what the various CEP vendors think about this …?

Some FAQs here.
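In the meantime, a trivial Java snippet shows why a raw “events per second” number is so easy to inflate: if the per-event work is a single comparison, the throughput looks heroic. This is purely an illustrative micro-benchmark of my own, not a STAC benchmark or any vendor’s code:

    import java.util.concurrent.ThreadLocalRandom;

    // Back-of-the-envelope throughput measurement: push synthetic ticks through a
    // trivial filter and report events/sec. Real CEP engines do vastly more work per
    // event (windows, joins, pattern matching), which is exactly why a published
    // "messages per second" figure is meaningless without the workload definition.
    public class NaiveThroughput {
        public static void main(String[] args) {
            final int events = 10_000_000;
            double[] prices = new double[events];
            for (int i = 0; i < events; i++) {
                prices[i] = 90 + 20 * ThreadLocalRandom.current().nextDouble();
            }

            long matches = 0;
            long start = System.nanoTime();
            for (int i = 0; i < events; i++) {
                if (prices[i] > 105.0) {   // the "complex event" here is a single comparison
                    matches++;
                }
            }
            long elapsed = System.nanoTime() - start;

            double perSec = events / (elapsed / 1e9);
            System.out.printf("%d events, %d matches, %.0f events/sec%n", events, matches, perSec);
        }
    }

Standardized workloads along the lines STAC is proposing are the only way to compare numbers like that across vendors.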

Street#Grid 2007 Impressions April 19, 2007

Posted by newyorkscot in HPC.
add a comment

I attended the Street#Grid conference at the W Hotel in Union Square the other day — this is the first NY version of the City#Grid event held at the end of last year in London. This was definitely a more niche event compared to most and was pleasantly low on the vendor-factor, with only some of the main players there: Datasynapse, Platform, Tangosol (now Oracle!), Gigaspaces as well as Microsoft’s Cluster Server.

What I liked about this conference was that it was more focused on the experiences of some of the main investment banks who have been doing grid and data fabric work for a while. However, there was clearly a difference of opinion between the banks and the vendors in terms of what is important from a technology standpoint, and as Marc mentions there was definitely a sad view of where the actual application developers sit in terms of priorities. Wachovia, Bear Stearns and JPMorgan all mentioned that their top priority is the manageability of their grids, and placed less importance on giving developers the tools to use the grids. At the same time (well, actually, towards the end of the last session) Adrian Kunzel, Global Head of Architecture of Investment Bank Technology at JPMorgan, stated that (quote) “…our developers can’t develop multi-threaded code”!! Which is it? They can’t develop applications for the grid because you have not given them the tools, or you are not going to prioritize giving them tools because they can’t develop multi-threaded code?

There was a lot of chat about virtualization, provisioning, etc, but ultimately it all boils down to getting better utilization out of the compute assets of the bank. Also, the validity of “outsourced CPU” should be questioned — given that <70% utilization prevails in the banks, demand has not yet met the supply internally, so what is really needed are better ways to utilize the grid and to “find new workloads” for it.

Bob Hintze, VP Utility Computing, Wachovia, gave a pretty good and practical presentation on his views of running a grid at the bank. I liked his views on making decisions quickly and basing the choice of vendor on ‘use cases’ that actually mean something to the business. Too many people try to make decisions on potentially building the “intergalactic, all vendor, all problem” solution. Bob also stated that he has less focus on SDKs, etc, and is more interested in (global) manageability and needs more functionality such as logging. That said, he later made the point that we need to be able to easily provide grid environments for disaster recovery (DR), business continuity, development and testing alike. Re: DR, he would prefer better “high availability” to DR since they are inherently intermingled anyway. SOA web services are the core of what uses the grid at Wachovia.

Buzz Moschetti, Chief Architecture Officer, Bear Stearns, also gave a decent outline of the challenges he is facing in the bank. Bear uses an internally developed grid for cash/fixed income and Datasynapse for Credit Derivatives and Calypso. One of the ironies of building the grid is that it can certainly add complexity to the infrastructure, and can cause a lot more issues and side-effects if portions of the grid get maxed out in terms of utilization, or if they fail. Key things to figure out include: a) capacity planning, which requires a detailed view of business priorities and processes; b) inconsistent platform configuration — you need to create a manifest; c) the challenges around versioning and incremental upgrading of software (and hardware) across grid nodes; d) epic-scale policy design – reflecting Bob Hintze’s comments about “Keeping It Simple Stupid”, there is no way to manage all aspects of all applications through policy while remaining sensitive to changes in the environment: you are better off keeping a clean view of what’s going on and going from there.

HP and IBM both pitched in on a few panels and seemed to be promoting open standards, better hardware acceleration (FPGA, etc) and being more agile & flexible in managing the environments. They both pointed out that there are indeed issues with developers being able to adopt grid technology in their applications.

Adrian Kunzel gave a presentation in the afternoon which discussed the natural tension between the managers of datacenters and grids, whereby the datacenter guys are always looking to optimize capacity and standardize/commoditize the hardware, while the grid guys are looking for additional workloads and machines to run them on. He then went on to discuss how virtualization is “an end, not a means” and that the ultimate end-game of the grid world is to increase utilization. Some other interesting points that he made included:

  • Development & Testing: virtualization can give you a lot of immediate benefit since the provisioning of a new dev environment can be done easily and can be run on cheaper machines. This is especially good for self-contained applications, and you can blow away the environment if you screw it up. (Sidenote: We have just completed a project where there was a virtual QA environment and it really sucked because they only allocated a total of 500MB RAM to the entire environment)
  • Vertical Scaling: the only way to vertically scale applications is by cycle scavenging to reclaim headroom, so that option runs out of steam pretty quickly.
  • Horizontal Scaling: this clearly can add capacity, but the issue is the speed of provisioning is too slow and there is a lack of coherent monitoring. Provisioning technologies are trying to keep up, but falling behind.
  • Grids are pretty good for isolation and partitioning, providing controls and reasonable workload scheduling.
  • Virtualization does not have that many tools and has no distribution mechanics.

Bottom line, the banks are constantly struggling to get better asset utilization and more compute capacity while trying to reduce costs. What the banks really need is a new approach to modeling OS interactions and resource consumption, while also developing the provisioning technologies that support distributed systems.

Other common points made during the day:

  • Manageability is most important: better tools are needed for scheduling and allocating tasks to certain portions of the grid, and for understanding the correlation of workload across disparate resources (by location, business, etc). There will always be hotspots, so a better understanding of utilization is key.
  • Open Standards — almost everyone agreed that there needs to be better collaboration in the industry in creating standard APIs for accessing grids, and for the vendors to provide a way to abstract the banks’ infrastructure and applications away from specific grid implementations (a rough sketch of what such an abstraction might look like follows below). Adrian Kunzel felt that we should be able to do this NOW and that we need to bring the banks’ own experience to bear alongside the mainly academic contributions to date on grid. He also felt data fabrics / caching are about 3-5 years behind compute grid in this regard. Others agreed that there needs to be more focus on solving business problems than on IT issues. They also felt that the standardization of grid APIs would help people like Murex and Calypso, as they already face the challenge of supporting multiple grid vendors’ infrastructures. Additionally, the industry needs to challenge the virtual machine guys to create a standard format.
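For what it is worth, here is roughly the shape such a vendor-neutral grid abstraction might take. This is purely a hypothetical sketch of mine (the GridSession and GridTask names do not correspond to any real product or standard); a Datasynapse or Platform binding would sit behind the interface:

    import java.io.Serializable;
    import java.util.List;
    import java.util.concurrent.Future;

    // Hypothetical vendor-neutral grid API, roughly the shape the panel was asking for.
    // Nothing here corresponds to a real product; each vendor would supply an adapter
    // that implements GridSession on top of its own scheduler.
    public interface GridSession {

        // A unit of work that can be shipped to any node in the grid.
        interface GridTask<T extends Serializable> extends Serializable {
            T execute() throws Exception;
        }

        <T extends Serializable> Future<T> submit(GridTask<T> task);

        <T extends Serializable> List<Future<T>> submitAll(List<? extends GridTask<T>> tasks);

        void close();
    }

Application code written against something like this could move between grid vendors without a rewrite, which is exactly the portability the banks were asking for.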

Distributed Cache Provider Updates February 16, 2007

Posted by newyorkscot in HPC.
2 comments

I see GemStone has announced that it now has support for native C++ and .NET clients to access its distributed caching product, Gemfire. It will be interesting to see what the uptake of that is on traders’ desktops, relative to its competitors in the marketplace. Unlike the products of those competitors (Tangosol and Gigaspaces), which are written in Java, Gemfire’s enterprise product actually comes in two flavours, Java and C++.

Meanwhile, Tangosol has recently announced support for .NET applications to use its Coherence data cache; it also now supports Spring to allow management of the data cache.

Not to be outdone, Gigaspaces also recently announced support for .NET and “introduced the concept” of PONOs (Plain Old .NET Objects) to allow consistency of programming models across the Java and .NET worlds.
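For a sense of what these data caches look like from application code, the Java side of Coherence exposes a NamedCache that behaves like a cluster-wide java.util.Map. A minimal sketch (the cache name and sample entry are made up, and it needs the Coherence jar and a running cluster to actually do anything):

    import com.tangosol.net.CacheFactory;
    import com.tangosol.net.NamedCache;

    // Minimal Coherence usage: a NamedCache is a cluster-wide java.util.Map.
    // The cache name "prices" and the sample entry are made up for illustration.
    public class CoherenceSketch {
        public static void main(String[] args) {
            NamedCache cache = CacheFactory.getCache("prices");

            cache.put("IBM", 120.5);     // visible to every member of the cluster
            Double last = (Double) cache.get("IBM");
            System.out.println("Last IBM price in the cache: " + last);

            CacheFactory.shutdown();     // leave the cluster cleanly
        }
    }

The .NET and C++ client announcements above are essentially about giving desktop and native applications that same map-style view of the shared data.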

General Purpose Computations on GPUs September 28, 2006

Posted by newyorkscot in HPC.
add a comment

Lab49‘s Damien Morton has been doing work on “General Purpose Computations on Graphics Processing Units”.  His work is written up in a whitepaper and can be found here.

Matt also references some further chip-core material here and notes additional GPU references.

High Performance on Wall St: FPGA September 20, 2006

Posted by newyorkscot in HPC.
1 comment so far

Of all the sessions, the most interesting was the one on Field Programmable Gate Arrays (FPGAs) – it also actually followed the outline described in the glossy brochure!

Although FPGAs can deliver up to 1000x faster performance than CPUs, an implementation may actually result in performance gains in the order of 40x or 200x, since the developer needs to strike a balance between designing for pure performance and flexibility of functionality. In the example given, the presenter had built a Monte Carlo simulation on a 15W FPGA chip that was 230x faster than a 3GHz CPU, but in another solution calculations were only 40x faster as they traded performance for flexibility.
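The presenter’s FPGA design obviously isn’t public, but for a sense of scale, the CPU baseline being beaten is essentially a loop like the one below: a plain Monte Carlo estimate of a European call. The class and all parameters are arbitrary illustration values of mine, not the presenter’s model:

    import java.util.concurrent.ThreadLocalRandom;

    // Plain-CPU Monte Carlo pricing of a European call option: the style of
    // embarrassingly parallel, arithmetic-heavy kernel that maps well onto an FPGA.
    // All parameters are arbitrary illustration values.
    public class MonteCarloCall {
        public static void main(String[] args) {
            double s0 = 100, k = 105, r = 0.05, sigma = 0.2, t = 1.0;
            int paths = 5_000_000;

            double drift = (r - 0.5 * sigma * sigma) * t;
            double vol = sigma * Math.sqrt(t);

            double payoffSum = 0;
            for (int i = 0; i < paths; i++) {
                double z = ThreadLocalRandom.current().nextGaussian(); // one draw per path
                double st = s0 * Math.exp(drift + vol * z);
                payoffSum += Math.max(st - k, 0.0);
            }

            double price = Math.exp(-r * t) * payoffSum / paths;
            System.out.printf("MC estimate of the call price: %.4f%n", price);
        }
    }

Because every path is independent and the arithmetic per path is fixed, this is exactly the kind of kernel that can be unrolled into a hardware pipeline, which is where the large speed-up factors come from.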

It would seem that most of the IBs are looking at proofs of concept of FPGAs, and possibly implementing a “golden node” inside a regular grid.

One of the key messages was the relative difficulty in implementing FPGA solutions:

  • Requires a higher ratio of engineering skills to modelling
  • It is a human process rather than an automated one.
  • Higher development costs (there is a 20-80 rule in that 80% of the work delivers only 20% incremental performance gain)

That said, the capital costs and operating expenses are considerably lower. For example, compare a 100,000-node CPU grid with a 100-node FPGA grid delivering the same performance: although the FPGA grid will be harder to implement, it will be cheaper to run and operate.

Although anyone trying to get into this field needs new engineering skills, equipment, etc, it seems that the only way adoption is really going to happen is by convincing business users of the potential upside and getting them to sponsor the program. Functionally, it was mentioned that the best types of applications are either a) functionality that is relatively stable (for high-throughput computation of well-known models) or b) high-value functionality that merits high performance (scenario-based risk analysis for complex credit derivatives).

Building solutions on FPGAs does require a new engineering approach versus CPU-based solutions as you have to design for acceleration. Upfront design based on requirements is VERY important and highly human-based.

Other Related Stuff:

At Lab49, Damien Morton has done a bunch of work on GPUs. It will be interesting to see which way the banks go with non-CPU solutions.

Matt previously posted some info on FPGAs here

High Performance on Wall St: IBM-fest September 20, 2006

Posted by newyorkscot in HPC.
add a comment

The second session of the conference was succinctly called “Enabling Financial Analytics for Competitive Advantage via High Performance, Scalable and Flexible Infrastructure” aka Big-Blue-Tooting-Its-Horn.

I guess the headline was the new form factor of their Blue Gene next-generation hybrid supercomputers, which leverage their Cell Broadband Engine Architecture. Each rack in this multi-core (AMD/Intel) machine has 1 Terabyte of memory and supports 2048 threads. It was described as a giant memory stick studded with chips and “pointers”, where the 16,000 nodes can deliver 8 Terabytes of memory and perform 90 Gigachases/sec. IBM have built this machine with no moving parts (e.g. the fans are separate), helping to keep the temperature down. In terms of I/O integration, the “memory stick” is studded with I/O chips to allow grid integration with both Datasynapse and Platform. Apparently, this can keep scaling, but starts to run into memory constraints.

Next up was the General Parallel File System (GPFS), which assists in the scaling of file servers and avoids the bottlenecks of NFS/SAN-based file systems. The idea is that any node can read from or write to any (shared) disk in the system. GPFS is not a client-server FS; it stores metadata with the files and has no single metadata server. Performance-wise, it allows access at the rate of 15GB/s for any single node and 100GB/s against any single file, supports hundreds of nodes, and provides over 200 Terabytes of storage.

IBM’s new BladeCenter was also profiled in terms of dealing with network latency, improved power output/heat density and supporting virtualization to control loads.

Finally, we got to the “Latency Stack” (not sure I want to buy a stack of latency!). This is another way of packaging all the new Websphere stuff that includes Websphere Extended Deployment (XD), Websphere Front Office for Financial Markets (which mainly deals with streaming data, apparently), and Websphere Realtime 1.0 (which has JVM extensions and allows control of the Garbage Collector and Ahead of Time (AOT) compilation).

High Performance on Wall St September 20, 2006

Posted by newyorkscot in HPC.
add a comment

I attended the 2006 High Performance on Wall Street event yesterday and went to a bunch of the sessions. This event was considerably smaller and more focused than most events, and most of the usual vendors were exhibiting, including a lot of hardware vendors. Some of the major comments from various sessions are below.

General Panel Comments – in a couple of the sessions, each panelist got to pitch their product or success story and give some insights. One key message that was reiterated several times was the need for better benchmarking and standards around grid. Some other specific comments included:

  • Lehman’s Thanos Mitsolides claimed the toughest issues are around initialization and stateless execution. The problem is not just about performance; scalability, security, load balancing, etc are very important too.
  • Bank of America’s Andy Doddington – being at the “top of the stack”, he sees needling issues at all layers in the stack. In the final session he reiterated Thanos’ comments about HPC not just being about performance: management, visibility, ease of deployment and simplicity are very important too, and he stressed that vendors need to keep all of this in mind in their products. He was also complimentary about JavaSpaces as an easy way to manage data (a minimal JavaSpaces sketch follows after this list).
  • Wombat said that there are lots of vendors coming up with what are essentially the same solutions. There need to be standard measurements of products, as well as a better understanding of where certain products apply to certain problem domains.
  • Reuters’ issues are around the transformation of data and fan-out. They would like to see standard APIs from people like Intel, etc to support local transformation because they don’t want to be hardware-dependent.
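On the JavaSpaces point, the appeal is that the API is essentially just write/read/take of template-matched entries. A minimal sketch, assuming a space has already been discovered (the Quote entry and the lookupSpace() helper are placeholders of mine; real code needs the usual Jini discovery boilerplate):

    import net.jini.core.entry.Entry;
    import net.jini.core.lease.Lease;
    import net.jini.space.JavaSpace;

    // Minimal JavaSpaces usage: data management is just write/read/take of entries
    // matched by template. The Quote entry and lookupSpace() helper are placeholders;
    // real code needs Jini service discovery to obtain the JavaSpace proxy.
    public class JavaSpacesSketch {

        // Entries need public object-typed fields and a public no-arg constructor.
        public static class Quote implements Entry {
            public String symbol;
            public Double price;
            public Quote() {}
            public Quote(String symbol, Double price) { this.symbol = symbol; this.price = price; }
        }

        public static void main(String[] args) throws Exception {
            JavaSpace space = lookupSpace(); // placeholder for Jini lookup

            // Publish a quote into the space for any interested consumer.
            space.write(new Quote("IBM", 120.5), null, Lease.FOREVER);

            // Template matching: null fields act as wildcards.
            // read() blocks up to the timeout and returns null if nothing matches.
            Quote template = new Quote("IBM", null);
            Quote latest = (Quote) space.read(template, null, 5_000);
            System.out.println("Read back: " + latest.symbol + " @ " + latest.price);
        }

        private static JavaSpace lookupSpace() {
            throw new UnsupportedOperationException("discover a JavaSpace via Jini lookup here");
        }
    }

The simplicity of that model is presumably what Andy Doddington meant: the hard distributed-state problems are pushed into the space implementation rather than the application code.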

Visions for Future – in the opening session, some of the panelists were asked about what they see for the future:

  • Lehman was concerned about the growth of clusters and what that will do to the scalability of the file system and backend cache infrastructures.
  • Gemstone said that they look forward to there being more stateful applications across more functions.
  • Technology Business Development Corp said they want to see more benchmarks.
  • Wombat thinks that the issue going forward is not the technology itself but rather technology management: as volumes grow, systems have to work with the datacenters/grids in place, so software efficiency will become more important.
  • Platform thinks that Quality of Service and performance (fault tolerance, resiliency, measurement) are important, especially as grids grow to 20,000 nodes in size. They also believe service-orientated infrastructures will be very important.
  • Reuters expressed their needs for standards and benchmarking as well as monitoring of real-time latency. They also think that people need to focus on better infrastructure and how it is structured in the network.

More specific posts to follow.

Quocirca’s Grid Index September 5, 2006

Posted by newyorkscot in HPC.
add a comment

I see Quocirca (and Oracle – they commissioned the article) have updated their Grid Index which shows “how initial pilots of Grid computing are now moving towards full implementations”.

According to the article, enterprise-wide grids are still rare, and organizations tend to favour more discrete cluster grids – in FS this is kind of consistent with the business-aligned implementations of vendors such as Datasynapse and Platform. I also thought it was interesting that the US leads in adoption rates (we see the opposite effect in Financial Services, where London is ahead of New York, for example).

There is mention of a tight correlation between localized SOA implementations and the use of grids, while broad-scale SOA adoption and grids are much more loosely correlated. This is not very surprising, as I have not seen many broad-scale SOAs in the first place, and even business-aligned SOAs in finance seem to have had limited success (as a real service-orientated architecture, versus the component-based architectures that most companies end up doing). At least the article’s conclusion about there being a low level of knowledge around SOAs confirms this. I am not so sure whether the reason is that companies do not want to combine too many new technologies into one larger project, or whether it is because businesses tend to be more stove-piped in general.

One thing I did not see referenced is the implementation of data grids, and distributed memory solutions in general, which are definitely enjoying some growth in FS. I would also like to see a similar study done that included virtualization and how it is actually being used.