Conor O'Mahony's Database Diary

Your source of IBM database software news (DB2, Informix, Hadoop, & more)

Announcing DB2 10 & InfoSphere Warehouse 10

with one comment

Today, IBM announced DB2 10 and InfoSphere Warehouse 10. You can read about these new releases in the Press Release and on the IBM Web site. Highlights of the new releases include:

  • Performance is up to 3.3x faster than previous release for complex query workloads*
  • The new Adaptive Compression has provided 7x or greater overall space savings for more than one client, with some tables achieving 10x space savings**
  • In DB2 10 Early Access Program testing, DB2 obtained an average of 98% compatibility with Oracle PL/SQL***
  • DB2 NoSQL Graph Store Accelerates Rational Use Case by up to 3.5x****

Check out the following great video from one of our Early Access Program participants:

For more information about these releases, make sure to visit the Launch Virtual Event.

* Based on internal tests of IBM DB2 9.7 FP3 vs. DB2 10.1 with new compression features on P6-550 systems with comparable specifications using data warehouse / decision support workloads, as of 4/3/2012.
** Based on client testing in the DB2 10 Early Access Program.
*** Based on internal tests and reported client experience from 28 Sep 2011 to 07 Mar 2012.
**** Based on internal benchmark tests of Rational Jazz graph store usage, comparing DB2 10 Graph Store with Jena TDB version 0.8.10.

Written by Conor O'Mahony

April 3, 2012 at 9:32 am

Posted in DB2 for LUW


Baltic Bank Moves from Oracle to DB2 to Improve Performance, Lower Costs, and Increase Availability

with one comment

JSC Rietumu Banka is one of the largest banks in the Baltic states. They recently migrated their data from Oracle Database on Sun servers to IBM DB2 on Power Systems servers, and enjoyed the following benefits:

  • Up to 30 times faster query performance
  • 20-30% reduction in total cost of ownership
  • 200% improvement in data availability

Like many major banks, JSC Rietumu Banka faced recent pressure to reduce IT costs. In particular, they were concerned with total cost of hardware, software, and staffing for their banking applications which used Oracle Database on Sun servers. After a thorough technical and financial evaluation, JSC Rietumu Banka chose to migrate their environment to DB2 on Power Systems servers.

Of course, the ease of migration was a significant factor in JSC Rietumu Banka being able to achieve these benefits. For more information about the “compatibility features” that make it easy to migrate from Oracle Database to IBM DB2, see Gartner: IBM DB2’s Maturing Oracle Compatibility Presents Opportunities, with some Limitations.

To learn more about this specific migration, read the full IBM case study.

Written by Conor O'Mahony

March 4, 2012 at 9:05 pm

How IBM and Oracle Approach Big Data Solutions


This blog post refers to the definition of Big Data commonly in use today. I do not include mainframe-based solutions, which some people might argue tackle Big Data challenges.

Both IBM and Oracle are going after the Big Data market. However, they are taking different approaches. I’m going to take a few moments to have a very brief look at what both companies are doing.

First of all, Oracle have introduced an “appliance” for Big Data. IBM have not. I put the word appliance in quotes because I consider this Oracle appliance to be closer in nature to an integrated collection of hardware and software components, rather than a true appliance that is designed for ease of operation. But the more important consideration is whether an appliance even makes sense for Big Data. There is a decent examination of this topic in the following blog post from Curt Monash and the accompanying comment stream: Why you would want an appliance — and when you wouldn’t. But, regardless of your position on this subject, the fact remains that Oracle currently propose an appliance-based approach, while IBM does not.

The other area I will briefly look at is the scope of the respective vendor approaches. In the press release announcing the Oracle Big Data Appliance, Oracle claim that:

Oracle Big Data Appliance is an engineered system optimized for acquiring, organizing, and loading unstructured data into Oracle Database 11g.

IBM takes a very different approach. IBM does not see its Big Data platform as primarily being a feeder for its relational database products. Instead, IBM sees this as being one possible use case. However, the ways that customers want to use Big Data technologies extend well beyond that use case. IBM is designing its Big Data platform to cater for a wide variety of solutions, some of which involve relational solutions and some of which do not. For instance, the IBM Big Data platform includes:

  • BigInsights for Hadoop-based data processing (regardless of the destination of the data)
  • Streams for analyzing data in motion (where you don’t necessarily store the data)
  • TimeSeries for smart meter and sensor data management
  • and more

So, as you can see, there are fundamental differences in the ways that IBM and Oracle are developing products for Big Data solutions. For more information, see IBM Big Data and Oracle Big Data.

NYSE Euronext uses Netezza to Manage their “Big Data”


NYSE Euronext operates multiple securities exchanges, including the New York Stock Exchange and Euronext. As you might imagine, securities exchanges present significant data management challenges. But NYSE Euronext didn’t just want to have a transactional system, they wanted to do much more with their data, further increasing the challenges. At the 2011 IBM Information On Demand (IOD) conference, NYSE Euronext described their challenges and the solution they chose. In particular, they highlight Netezza’s tremendous performance and how fast it is to get up-and-running with Netezza.

Not only is it easy to get up-and-running with Netezza, but it is easy to manage your environment on an ongoing basis. You can hear for yourself in this short video segment…

Written by Conor O'Mahony

February 24, 2012 at 12:02 pm

Coca Cola Bottling Move from Oracle Database to IBM DB2

with one comment

At the 2011 IBM Information On Demand (IOD) Conference, Coca Cola Bottling spoke about their experiences when moving from Oracle Database to IBM DB2. I have included some very brief video segments shot at the conference below. It is really interesting to hear about the experiences and impact of switching from Oracle to IBM from the people involved.

In the following short video segment, hear how Coca Cola Bottling have changed their fix pack philosophy as a result of moving. With Oracle Database, they would avoid fix packs unless they “had to”. But with DB2, applying fix packs is much easier and faster, providing faster access to new functionality, performance improvements, and bug fixes. Also, hear about how Coca Cola Bottling have had significant data storage savings thanks to moving to DB2. Who wouldn’t want to reclaim some of that IT budget allocated for storage purchases 🙂

And finally, hear about their experiences with performance boosts and the autonomic computing capabilities in DB2.

Written by Conor O'Mahony

February 23, 2012 at 11:07 am

Get a Free Copy of the Forrester Wave™ for Enterprise Hadoop Solutions

with 2 comments

Today, Forrester published its Wave analysis for enterprise Hadoop solutions. It has detailed coverage of the Hadoop solutions from vendors like IBM, MapR, Cloudera, Hortonworks, and others. If you are considering an enterprise Hadoop solution, such as IBM InfoSphere BigInsights, it will make for very interesting reading. You can download a free copy of the report from The Forrester Wave™: Enterprise Hadoop Solutions, Q1 2012.

Written by Conor O'Mahony

February 2, 2012 at 2:41 pm

Oracle Reduce their Exadata Projections

with 5 comments

In June of last year, during Oracle’s FYQ4 2011 earnings call, Larry Ellison claimed that Oracle expect more than 2,000 Exadata systems to be installed in fiscal year 2012. His exact quote follows. You can read the full transcript on SeekingAlpha at Oracle’s CEO Discusses Q4 2011 Results.

Today, more than 1,000 Exadatas are installed, and we plan on tripling that number this year.

He noted that more than 1,000 systems had been installed at that time. Tripling this number yields more than 3,000. This implies that there would be more than two thousand new systems installed in FY2012.

Last month, during Oracle’s FYQ2 2012 earnings call, Larry Ellison said:

This past Q2, Oracle sold over 200 Exadata and Exalogic engineered systems. In Q3, we plan to sell over 300 Exadata and Exalogic engineered systems. In Q4, we plan to sell over 400 Exadata and Exalogic engineered systems.

Again, the full transcript is available on SeekingAlpha at Oracle’s CEO Discusses Q2 2012 Results. There is no reference to Q1 sales, but Oracle projects that Q2 + Q3 + Q4 sales of both Exadata and Exalogic will be more than 900.

A couple of things stand out here. The first is that these latest projections from Oracle are for both Exadata and Exalogic systems combined, whereas the original projection was for Exadata systems only. The second is that these latest projections from Oracle are significantly down (more than 2,000 has been revised down to whatever business they did in Q1 + more than 900 in Q2, Q3, and Q4 combined). And this significant downward revision in projections has happened in the space of just 6 months.

If you read the Q&A segment from the Q2 earnings call, it is quite interesting. An analyst asks Oracle about the downward revision in projections. There are some semi-coherent responses from Ellison and Hurd, before Hurd claims that instead of 3x growth in engineered systems, they are on track for 2.5x growth. Hmmm, unless they had a monster Q1, that doesn’t quite add up either 🙂
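
The arithmetic behind this comparison can be sketched in a few lines of Python. This is only a back-of-the-envelope check, not anything Oracle published: it treats the "more than" figures from the two earnings calls as exact lower bounds, and it assumes (my assumption) that Hurd's 2.5x applies to the same base of 1,000 installed systems. The function and variable names are made up for illustration.

```python
# Figures quoted from the two earnings calls discussed in this post.
# All of them were stated as "more than", so treat them as lower bounds.
installed_base = 1_000               # Exadata systems installed as of the Q4 FY2011 call
planned_q2_q3_q4 = [200, 300, 400]   # Exadata + Exalogic combined, per quarter


def implied_new_systems(installed, multiple):
    """New installs needed for the installed base to grow to installed * multiple."""
    return installed * multiple - installed


def required_q1(target_new, later_quarters):
    """Q1 sales needed so that Q1 plus the later quarters reaches target_new."""
    return target_new - sum(later_quarters)


original_target = implied_new_systems(installed_base, 3)    # "tripling"
revised_target = implied_new_systems(installed_base, 2.5)   # assumed reading of "2.5x"

print("Implied new systems at 3x:  ", original_target)
print("Planned Q2 + Q3 + Q4:       ", sum(planned_q2_q3_q4))
print("Q1 needed to reach 3x:      ", required_q1(original_target, planned_q2_q3_q4))
print("Q1 needed to reach 2.5x:    ", required_q1(revised_target, planned_q2_q3_q4))
```

Under these assumptions, Q1 alone would have needed to exceed every other quarter in the plan for either growth figure to hold, and that is before accounting for the later figures including Exalogic as well as Exadata.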

Written by Conor O'Mahony

January 30, 2012 at 1:17 pm

Posted in Oracle Exadata


Win a Trip to the IDUG Conference of your Choice


The International DB2 User Group (IDUG) is a user-run organization. If you want independent information about DB2, IDUG is the place to go. This year, IDUG is holding conferences in the US (Denver), Germany (Berlin), and Australia (Sydney). The good news is that the DB2Night Show is holding a contest, and the prize is an all-expenses-paid trip to the IDUG conference of your choice. The contest aims to identify new users who can speak about their experiences with DB2. It’s a talent contest of sorts, where the talent is sharing your experiences. If you have ever considered speaking at a conference, this contest is the ideal way to see how you might do in a fun setting.

Written by Conor O'Mahony

January 25, 2012 at 2:01 pm

Anatomy of an Oracle Marketing Claim

with 10 comments

Yesterday, Oracle announced a new TPC-C benchmark result. They claim:

In this benchmark, the Sun Fire X4800 M2 server equipped with eight Intel® Xeon® E7-8870 processors and 4TB of Samsung’s Green DDR3 memory, is nearly 3x faster than the best published eight-processor result posted by an IBM p570 server equipped with eight Power 6 processors and running DB2. Moreover, Oracle Database 11g running on the Sun Fire X4800 M2 server is nearly 60 percent faster than the best DB2 result running on IBM’s x86 server.

Let’s have a closer look at this claim, starting with the first part: “nearly 3x faster than the best published eight-processor result posted by an IBM p570 server”. Interestingly, Oracle do not lead by comparing their new leading x86 result with IBM’s leading x86 result. Instead they choose to compare their new result to an IBM result from 2007, exploiting the fact that even though this IBM result was on a different platform, it uses the same number of processors. Of course, we all know that the advances in hardware, storage, networking, and software technology over half a decade are simply too great to form any basis for reasonable comparison. Thankfully, most people will see straight through this shallow attempt by Oracle to make themselves look better than they are. I cannot imagine any reasonable person claiming that Oracle’s x86 solutions offer 3x the performance of IBM’s Power Systems solutions, when comparing today’s technology. I’m sure most people will agree that this first comparison is simply meaningless.

Okay, now let’s look at the second claim: “nearly 60 percent faster than the best DB2 result running on IBM’s x86 server”. Oracle now compare their new leading x86 result with IBM’s leading x86 result. However, if you look at the benchmark details, you will see that IBM’s result uses half the number of CPU processors, CPU cores, and CPU threads. If you look at performance per core, the Oracle result achieves 60,046 tpmC per CPU core, while the IBM result achieves 75,367 tpmC per core. While Oracle claims to be 60% faster, if you take into account relevant system size and determine the performance per core, IBM is actually 25% faster than Oracle.

Finally, let’s not forget the price/performance metric from these benchmark results. This new Oracle result achieved US$.98/tpmC, whereas the leading IBM x86 result achieved US$.59/tpmC. That’s correct: when you determine the cost of processing each transaction for these two benchmark results, IBM is 39% less expensive than Oracle. (BTW, I haven’t had a chance yet to determine whether Oracle used their usual TPC price/performance tactics for this benchmark result, as the result details are not yet available to me; but if they have, the IBM system will prove to be even less expensive again than the Oracle system.)

Benchmark results are as of January 17, 2012: Source: Transaction Processing Performance Council (TPC), http://www.tpc.org.
Oracle result: Oracle Sun Fire X4800 M2 server (8 chips/80 cores/160 threads) – 4,803,718 tpmC, US$.98/tpmC, available 06/26/12.
IBM results: IBM System p 570 server (8 chips/16 cores/32 threads) – 1,616,162 tpmC, US$3.54/tpmC, available 11/21/2007. IBM System x3850 X5 (4 chips/40 cores/80 threads) – 3,014,684 tpmC, US$.59/tpmC, available 09/22/11.
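
If you want to check the per-core and price/performance comparisons yourself, the following Python sketch recomputes them from the published figures listed above. The class and variable names are my own and purely illustrative; only the core counts, tpmC values, and US$/tpmC values come from the TPC results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TpcResult:
    """One published TPC-C result, using only the figures quoted above."""
    system: str
    cores: int
    tpmc: int
    usd_per_tpmc: float

    @property
    def tpmc_per_core(self) -> float:
        return self.tpmc / self.cores


oracle_x4800 = TpcResult("Oracle Sun Fire X4800 M2", cores=80, tpmc=4_803_718, usd_per_tpmc=0.98)
ibm_x3850 = TpcResult("IBM System x3850 X5", cores=40, tpmc=3_014_684, usd_per_tpmc=0.59)

# Raw throughput ratio: the basis of the "nearly 60 percent faster" claim.
raw_ratio = oracle_x4800.tpmc / ibm_x3850.tpmc

# Per-core ratio: normalizes for the Oracle system having twice the cores.
per_core_ratio = ibm_x3850.tpmc_per_core / oracle_x4800.tpmc_per_core

# Fraction by which the IBM cost per tpmC is lower than the Oracle cost.
price_saving = 1 - ibm_x3850.usd_per_tpmc / oracle_x4800.usd_per_tpmc

for result in (oracle_x4800, ibm_x3850):
    print(f"{result.system}: {result.tpmc_per_core:,.0f} tpmC per core")
print(f"Oracle raw throughput vs IBM: {raw_ratio:.2f}x")
print(f"IBM per-core throughput vs Oracle: {per_core_ratio:.2f}x")
print(f"IBM cost per tpmC is lower by: {price_saving:.1%}")
```

Per-core throughput is only one way to normalize for system size. Normalizing per chip or per thread gives the same ratio here, because the Oracle configuration has exactly twice as many of each.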

Written by Conor O'Mahony

January 18, 2012 at 11:01 am

Top Posts of 2011


It’s that time of year again. Here are the top posts from this blog in 2011, as judged by number of views.

  1. IBM DB2 Welcomes Oracle Database/HP Itanium Customers
  2. New IBM DB2 vs. Oracle Database Advertising Campaign
  3. A Closer Examination of Oracle’s “Database Performance” Advertisement
  4. Comparing Price for Oracle Exadata and IBM Smart Analytics System
  5. IBM DB2 Strikes Another Blow to Oracle Database

As you can see, there is a strong DB2/Oracle Database competitive theme running through these popular topics. And here are the top posts of 2011, as judged by reader participation. In other words, as judged by the number of comments (or perhaps the amount of controversy).

  1. New IBM DB2 vs. Oracle Database Advertising Campaign (20 comments)
  2. A Closer Examination of Oracle’s “Database Performance” Advertisement (19 comments)
  3. The Future of the NoSQL, SQL, and RDBMS Markets (12 comments)
  4. Update on the IBM DB2 “SQL Skin” for Migrating from Sybase ASE (8 comments)
  5. Industry Benchmark Result for DB2 pureScale: SAP Transaction Banking (TRBK) Benchmark (7 comments)

Written by Conor O'Mahony

December 19, 2011 at 11:00 am

Posted in Uncategorized

Deploying DB2 and InfoSphere Warehouse on Private Clouds


Cloud computing is certainly a hot topic these days. If an organization is not already using cloud computing, it has plans to do so. The economics, agility, and value offered by cloud computing are just too persuasive for IT organizations to ignore.

Even the high-profile Amazon outage couldn’t slow cloud computing’s relentless march towards mainstream adoption. If anything, that outage helped make cloud computing more robust by highlighting the need for hardened policies and procedures around provisioning in the cloud.

IBM recently announced updates to a set of products that make it easy to deploy DB2 and InfoSphere Warehouse on private clouds:

  • IBM Workload Deployer (previously known as WebSphere CloudBurst), which is a hardware/software appliance that streamlines the deployment and management of software on private clouds.
  • IBM Transactional Database Pattern, which works with the IBM Workload Deployer to generate DB2 instances that are suitable for transactional workloads.
  • IBM Data Mart Pattern, which generates InfoSphere Warehouse instances for data mart workloads.

These patterns consist of more than just deploying virtual images with pre-configured software. You should instead think of them as being like mini-applications for configuring and deploying cloud-based database instances. Users specify information about the database, and then the pattern builds and deploys the database instance.

The Transactional Database Pattern is for OLTP deployments. It includes templates for sizing the virtual machine, database backup scheduling, database deployment cloning capabilities, and tooling (including Data Studio). The Data Mart Pattern incorporates the features of the OLTP pattern, together with deep compression and data movement tools. But, of course, it is configured and optimized for data mart workloads in a virtual environment.

Written by Conor O'Mahony

December 12, 2011 at 5:40 pm

Need Help Determining Hadoop Split Sizes? Use Adaptive MapReduce Instead!

with 2 comments

IBM is actively working on adaptive features for the Map and Reduce phases of its InfoSphere BigInsights product (which is based on Apache Hadoop). In some cases, this involves applying techniques commonly found in mature data management products, and in some cases it involves developing new techniques. While a number of these adaptive features are still under development, there are some features in the product today. For instance, BigInsights currently includes an Adaptive Mapper capability that allows Mappers to successively process multiple splits for a job, and avoid the start-up costs for subsequent splits.

When a MapReduce job begins, Hadoop divides the data into multiple splits. It then creates Mapper tasks for each split. Hadoop deploys the first wave of Mapper tasks to the available processors. Then, as Mapper tasks complete, Hadoop deploys the next Mapper tasks in the queue to the available processors. However, each Mapper task has a start-up cost, and that start-up cost is repeated each time a Mapper task starts.

With BigInsights, there is not a separate Mapper task for each split. Instead, BigInsights creates Mapper tasks on each available processor, and those Mapper tasks successively process the splits. This means that BigInsights significantly reduces the Mapper start-up cost. You can see the results of a benchmark for a set-similarity join workload in the following chart. In this case, the tasks have a high start-up cost. The AM bar (Adaptive Mapper) in the chart is based on a 32MB split size. You can see that by avoiding the recurring start-up costs, you can significantly improve performance.

Adaptive MapReduce Benchmark: Set-Similarity Join Workload

Of course, if you chose the largest split size (2GB), you would achieve similar results to the Adaptive Mapper. However, you might then expose yourself to the imbalanced workloads that sometimes accompany very large splits.
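
To make the start-up cost argument concrete, here is a toy cost model in Python. It is not how BigInsights or Hadoop is implemented, and the numbers are invented for illustration. It assumes every split takes the same time to process, that all processor slots are identical, and it ignores scheduling overhead, data locality, and skew.

```python
import math


def per_split_tasks_time(num_splits, num_slots, startup_cost, work_per_split):
    """One Mapper task per split: every split pays the start-up cost.

    Tasks run in waves of num_slots, so the elapsed time is the number of
    waves multiplied by the cost of one task.
    """
    waves = math.ceil(num_splits / num_slots)
    return waves * (startup_cost + work_per_split)


def adaptive_mapper_time(num_splits, num_slots, startup_cost, work_per_split):
    """One long-lived Mapper per slot: start-up is paid once per slot.

    Each Mapper then works through its share of the splits back to back.
    """
    splits_per_slot = math.ceil(num_splits / num_slots)
    return startup_cost + splits_per_slot * work_per_split


# Hypothetical job: 640 small splits on 32 slots, with a start-up cost
# (5 time units) that is large relative to the work per split (2 units).
job = dict(num_splits=640, num_slots=32, startup_cost=5, work_per_split=2)

baseline = per_split_tasks_time(**job)
adaptive = adaptive_mapper_time(**job)

print("One task per split:", baseline, "time units")
print("Adaptive Mapper:   ", adaptive, "time units")
```

In this model the benefit grows with the ratio of start-up cost to per-split work, which is consistent with the high start-up cost workload showing the larger gain. Picking very large splits reduces the number of waves in the first function in the same way, but the model does not capture the load imbalance that large splits can introduce.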

The following chart shows the results of a benchmark for a join query on TERASORT records. Again the AM bar (Adaptive Mapper) in the chart is based on a 32MB split size.

Adaptive MapReduce Benchmark: TERASORT Join Workload

In this case, the Adaptive Mapper results in a more modest performance improvement, although it is still an improvement. The key benefit of these Adaptive MapReduce features is that they eliminate some of the hassles associated with determining the split sizes, while also improving performance.

As I mentioned earlier in this post, a number of additional Adaptive MapReduce features are currently in development for future versions of BigInsights. I look forward to telling you about them when they are released…

In the meantime, make sure to check out the free online Hadoop courses at Big Data University. I previously blogged about my experiences with these courses in Hadoop Fundamentals Course on BigDataUniversity.com.

Written by Conor O'Mahony

December 7, 2011 at 1:07 pm

Comparing HDFS and GPFS for Hadoop


Here is a chart that compares the performance of Hadoop Distributed File System (HDFS) with General Parallel File System-Shared Nothing Cluster (GPFS-SNC) for certain Hadoop-based workloads (it comes from the Understanding Big Data book). As you can see, GPFS-SNC easily out-performs HDFS. In fact, the book claims that a 10-node GPFS-SNC-based Hadoop cluster can match the performance of a 16-node HDFS-based Hadoop cluster.

Comparing HDFS and GPFS for Hadoop Workloads
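
As a rough way to read the 10-node versus 16-node claim, the short Python calculation below converts it into a per-node ratio. It assumes the two clusters deliver equal total throughput and that throughput scales linearly with node count. Both are my simplifying assumptions, not statements from the book.

```python
# Node counts from the claim quoted above.
hdfs_nodes = 16
gpfs_snc_nodes = 10

# If both clusters deliver the same total throughput, each GPFS-SNC node
# must be doing this many times the work of an HDFS node.
per_node_ratio = hdfs_nodes / gpfs_snc_nodes

# Fraction of nodes saved for the same workload.
node_reduction = 1 - gpfs_snc_nodes / hdfs_nodes

print(f"Implied per-node throughput ratio: {per_node_ratio:.1f}x")
print(f"Fewer nodes needed: {node_reduction:.1%}")
```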

GPFS was developed by IBM in the 1990s for high-performance computing applications. It has been used in many of the world’s fastest computers (including Blue Gene and Watson). Recently, IBM extended GPFS to develop GPFS-SNC, which is suitable for Hadoop environments. A key difference between GPFS-SNC and HDFS is that GPFS-SNC is a kernel-level file system, whereas HDFS runs on top of the operating system. This means that GPFS-SNC offers several advantages over HDFS, including:

  • Better performance
  • Storage flexibility
  • Concurrent read/write
  • Improved security

If you are interested in seeing how GPFS-SNC performs in your Hadoop cluster, please contact IBM. Although GPFS-SNC is not in the current release of InfoSphere BigInsights (IBM’s Hadoop-based product), GPFS-SNC is currently available to select clients as a technology preview.

Written by Conor O'Mahony

November 30, 2011 at 1:07 pm

Informix Users are Going to San Diego


It has just been announced that next year’s International Informix Users Group (IIUG) conference will be held in San Diego, California on 22 – 25 April. The IIUG Conference continues to offer incredible value. Sign up soon to get the $695 early bird rate, and if you sign up for free IIUG membership, you even get $100 off that rate. $595 for a conference of this length and quality is amazing value. But you’re going to have to act fast to get this discount rate!

And don’t forget that San Diego is such a great city to visit. Not only is it a wonderful city with an ideal year-round climate, but it also has a fantastic array of attractions like the world-famous San Diego Zoo, SeaWorld, LEGOLAND, and the Zoo Safari Park (a personal favorite).

International Informix Users Group (IIUG) Conference

Written by Conor O'Mahony

November 30, 2011 at 9:22 am

Posted in DBA, IIUG, Informix


Highlights from the IDUG EMEA Conference


I’m still in the afterglow of the International DB2 User Group (IDUG) conference in Prague, Czech Republic. It was another great conference at a great facility in a great city. The conference organizers should be commended on a truly outstanding event. It’s incredible to think that the conference organizers are user volunteers, and not professional conference planners! I’m already looking forward to the next IDUG EMEA conference in Berlin next year. If you are interested in a more in-depth discussion of the conference, including lessons learned from the technical sessions, Norberto Filho will be appearing on the DB2Night Show on Friday 02 December 2011. Even if you were at the conference, there was so much happening there that you are sure to learn something new from Norberto’s experiences.

Written by Conor O'Mahony

November 30, 2011 at 8:29 am

Posted in DB2 for LUW, DB2 for z/OS, DBA, IDUG

