DB2 for LUW and DB2 for z/OS are Converging
As many of you know, the features and functions in the mainframe version of DB2 (DB2 for z/OS) and the distributed version (DB2 for Linux, Unix, and Windows) have been converging for a number of years.
These two flavors of DB2 now share much of the same SQL syntax. You can read the details in SQL Reference for Cross-Platform Development. There are some differences in syntax, because the platforms – hardware and software – have different strengths, and the SQL is tailored to take advantage of those strengths to deliver the best performance, scalability, availability, and security on each platform. However, the differences in syntax are being minimized, and are the exception rather than the rule.
The drivers, services, and tooling have also been converging. The Data Studio tooling provides common services for all platforms, allowing you to easily administer DB2 across all platforms.
Also, features like native XML data management (pureXML), clustering (pureScale), and compression (Deep Compression) use similar architectures and algorithms across both of these flavors of DB2. For instance, pureScale on Unix uses the same architecture and approach as the Data Sharing technology in Parallel Sysplex on the mainframe. And, of course, the Data Sharing technology is recognized as the most effective and efficient scale-out architecture in the industry. Similarly, IBM recently brought compression techniques from the mainframe to Linux, Unix, and Windows to even further stretch that flavor of DB2’s lead in storage optimization on distributed systems.
So, while these two flavors of DB2 are not identical, they are coming closer together from both a feature and a function point of view. Their interfaces are also converging, and currently share a high degree of commonality. This convergence is making it easier for the many organizations that use both flavors of DB2 to re-use their DB2 skills on the other platform. I have had numerous conversations with DB2 for z/OS users who have easily picked up DB2 for LUW skills, and I am happy to let you know that this will become even easier as this convergence continues in future releases of DB2.
For another perspective on how the mainframe and distributed versions of DB2 are converging, and for my inspiration for this blog post, see the DB2 Technology Converging article in the IDUG Solutions Journal.
DB2 pureScale Cluster Scale-out Efficiency
I mentioned earlier this week that IBM has published scale-out numbers for DB2 pureScale. I thought I’d take a few minutes to let you know about those numbers. To measure the efficiency of scaling out a system, IBM successively added DB2 servers to form larger and larger clusters. A workload that is representative of a Web commerce application (90% read and 10% write) was used. Unlike many Oracle RAC benchmarks, the IBM lab tests did not require any cluster awareness (that is, transactions were completely random, with no need for transaction routing to specific nodes). The diagram below shows the results. Up to 64 nodes in the cluster, the scalability (compared to the 1-member result) is above 95%, and at 128 nodes the scalability is at 84%.
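To make the percentages concrete, here is a minimal sketch of how a scale-out efficiency figure is commonly computed: measured cluster throughput divided by perfect linear scaling of the 1-member result. This formula is my assumption about what "scalability compared to the 1 member result" means, since the exact calculation IBM used is not spelled out here. The throughput numbers in the example are hypothetical, chosen only so that the output lines up with the published percentages.

```python
def scaling_efficiency(cluster_throughput, single_member_throughput, members):
    """Return cluster throughput as a fraction of perfect linear scaling.

    Perfect linear scaling means N members deliver N times the throughput
    of one member. This definition is an assumption, not IBM's stated method.
    """
    if members < 1:
        raise ValueError("members must be at least 1")
    if single_member_throughput <= 0:
        raise ValueError("single_member_throughput must be positive")
    return cluster_throughput / (members * single_member_throughput)


# Hypothetical throughput figures (transactions per second), for illustration only.
single_member_tps = 1000.0
measured_tps = {64: 61_000.0, 128: 107_520.0}

for members, tps in measured_tps.items():
    efficiency = scaling_efficiency(tps, single_member_tps, members)
    print(f"{members} members: {efficiency:.0%} of linear scaling")
```

Read this way, 84% at 128 nodes means the cluster delivers the work of roughly 107 perfectly scaling members.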

By the way, if you want to see how easy it is to add or remove nodes in the cluster, look at the following video. Please remember that the DB2 pureScale approach does not require any re-partitioning of data, either manual or automated, when nodes are added or removed. This is because DB2 pureScale is based on the approach that has been successfully running on the mainframe for decades, which takes advantage of centralized locking to provide the ultimate database scale-out efficiency.
Larry Ellison Tilting at Windmills
All Larry Ellison’s recent bluster about IBM reminded me of Don Quixote tilting at windmills. So I thought I’d try my hand at capturing that thought. As you can see, I have extremely limited artistic skills. So, apologies in advance for my rudimentary drawing abilities. But I did have some fun spending 20-30 minutes trying to put this down on paper.

Are Larry Ellison and Oracle out of Touch with the Market?
Larry Ellison was his usual entertaining self on yesterday’s Webcast announcing the Oracle/Sun strategy. However, some of his statements were so far from reality that it left me wondering if he knows what’s happening in the database market. For instance, he crowed “the Oracle Database scales out, IBM DB2 for Unix does not. Let me see, how many servers can IBM put together for an OLTP application? Let’s see, how many can they group together? Um, one. They can have up to one server attacking really big jobs. When they need more capacity, they make that server bigger. And then they take the old server out, put a bigger one in. And when you’ve got the biggest server, that’s it. That’s all they can do for OLTP”. Actually, Larry, you couldn’t be more wrong. DB2 pureScale supports up to 128 nodes. And, not only that, but IBM has published scale-out numbers for DB2 pureScale, something that we would love to see Oracle provide for Real Application Clusters (RAC).
He also claimed that DB2 “can’t scale out, they can’t do cloud, they can’t do clusters, they can’t do any of this”. The fact is, DB2 offers both pureScale (using a shared-disk architecture) and DPF (using a shared-nothing architecture). This allows users to choose the approach that best suits their environment. Oracle, on the other hand, offers only a shared-disk RAC architecture. Some might argue that Exadata simulates a form of shared-nothing backend for Oracle. However, this is essentially a band-aid that attempts to address scalability issues by throwing more hardware at the problem.
As regards whether DB2 can do the cloud, many people would argue that a shared-nothing architecture like DB2 DPF is the best approach for private clouds. Also, don’t forget that DB2 already has multiple cloud offerings in the marketplace, including DB2 on Amazon Elastic Compute Cloud (EC2).
Finally, Larry claimed that “you would’ve thought, years ago, that IBM would have come out with a database machine. I mean it’s so obvious, they’ve got hardware, they’ve got DB2. Why in the world didn’t they come out with a database machine? It’s fascinating”. Actually, IBM has come out with such an offering. IBM has been doing this since we introduced the Balanced Warehouse to the market in 2007. The most recent examples of such systems that IBM has brought to market include DB2 pureScale and the IBM Smart Analytics System (which includes the entire stack needed for analytics, from hardware through ETL, warehousing, reporting, and analytics).
I really am astonished that Larry Ellison would make such fundamental factual errors on such a prominent Webcast.
Want to go to Rome?
This year the European IBM Information on Demand (IOD) conference is in Rome, Italy. It runs from 19 May to 21 May 2010 in the Marriott Park Hotel. To learn more about the conference, see the IBM Information on Demand EMEA Conference Web site.
IBM is currently accepting proposals for speaking sessions at the conference. IBM is particularly interested in securing DB2 users to speak. If you are a DB2 user and have something interesting to share, please submit a proposal. Interesting topics include sharing your experiences with a particular DB2 feature, best practices you observed when working with DB2, or literally anything else that you think other potential DB2 users can benefit from hearing. To submit a proposal, go to the Call for Speakers Web page.
Free Class: IBM Software on Amazon Web Services (AWS)
Are you interested in deploying IBM software on a scalable, secure, and on-demand cloud environment? If so, then you will be interested in a free 2-day class that IBM is offering in the Toronto area. Attendees will receive hands-on knowledge of the AWS building blocks. You will learn about the services at feature, function, and philosophy levels that will help you to understand the intersection points between AWS and IBM. You will also learn specific details about using IBM software like DB2, Informix, and WebSphere sMash in Amazon Web Services (AWS) environments. The class covers a wide variety of technical and non-technical topics, such as:
- Cloud computing, virtualization, and AWS tools and technologies
- IBM products and cloud computing solutions available on AWS
- How to leverage existing IBM and AWS technologies to achieve Software as a Service
- Hands-on sessions with AWS technologies and various IBM products on Amazon Machine Images
- And more…
The class runs from 25 Jan 2010 to 26 Jan 2010. For more information and to sign up, see Amazon Web Services (AWS) training.
News about IDUG Conferences
If you are a DB2 user, make sure to join the recently created LinkedIn groups for the IDUG conferences. You will get all the latest news about locations, speakers, special offers, and more. Here are the links:
IBM Data Management Partner Bootcamps now Online
If you are an IBM partner who works with DB2, I recommend that you check out the recent additions to the IBM Learner Portal. Each year, the IBM Data Management team travel the world to host their popular bootcamps in classrooms near you. These bootcamps consist of both educational presentations and hands-on training. There are bootcamps for a variety of products, including DB2, Informix, and Optim. Now, IBM are making it even easier for you to access this high-quality technical content by adding the bootcamps to the IBM Learner Portal. That’s right, you can access these bootcamps from the comfort of your own desk. Not only that, but the topics in each bootcamp are accessible in a modular format, allowing quick access to individual topics.
Clarifying Some Recent Oracle Benchmark Claims
Earlier this month, Oracle issued a press release regarding their latest results for the SAP® Business Intelligence-Data Mart (BI-D) Standard Application Benchmark. In the press release, Oracle claim that the new Oracle Database result surpasses the best IBM DB2 result by more than six times. While this statement is true, it is a little misleading. You see, IBM does not perform the SAP BI-D benchmark tests for DB2 on the Linux, Unix, Windows, or z platforms. IBM performs these benchmarks only for DB2 on the i platform. So, when Oracle says that its result surpasses the best IBM DB2 result by more than six times, it is comparing a 4-node Oracle RAC cluster with a single DB2 for i system. Oracle is also comparing a recent benchmark result against an IBM result that is more than a year old. So, while these comparisons seem spectacular at first glance, once you look at the details you realize that they are not entirely fair.
Also, please note that the BI-D benchmark is a read-only benchmark. We have increasingly heard from clients that this benchmark is not realistic for their environments. IBM clients tell us that their systems typically have mixed workloads. For instance, demands for more current data typically result in trickle feeding, and they often need to rebuild cubes while queries are running. For these reasons, IBM is increasingly unlikely to perform the BI-D benchmark tests. Instead, clients are interested in the more realistic SAP BI-MXL benchmark, which runs inserts, updates, and deletes alongside the queries.
The other thing to be aware of is that these benchmark results further illustrate the gap in levels of support for SAP applications provided by these database software products. With IBM DB2, you simply set one variable (DB2_WORKLOAD=SAP) and all the required settings are configured automatically for you. However, if you look at the benchmark submission package, you will see that these new benchmark results from Oracle use some interesting configuration settings. Undocumented and unsupported configuration parameters in Oracle Database begin with an underscore. Well, Oracle uses six of these undocumented and unsupported configuration parameters in these benchmark results. I’m sure it’s not reassuring for Oracle Database users to learn that Oracle needs to use undocumented and unsupported parameters to get optimal performance. The parameters in question are:
- _optimizer_cost_based_transformation = off
- _query_rewrite_fudge = 1
- _improved_row_length_enabled = FALSE
- _optim_peek_user_binds = FALSE
- _optimizer_autostats_job = FALSE
- _optimizer_save_stats = FALSE
There are some interesting settings in here. Apparently, they don’t trust the cost-based transformations in their optimizer because they turn them off. And they appear to be setting some sort of query rewrite fudge factor. I wonder what fudge factor you should choose for optimal performance in your environment. And, if they are disabling it, perhaps the row length setting does not improve things as its name suggests it might.
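If you want to check a configuration for this kind of setting yourself, the underscore rule is easy to automate. The following is a minimal sketch, assuming the parameters are available as plain "name = value" lines of text; the function name hidden_parameters and the sample text are hypothetical and are not taken from the benchmark submission package. The six underscore parameters are the ones listed above, and db_block_size is included only as an example of an ordinary documented parameter.

```python
import re

# Matches lines such as "name = value" or "*.name=value".
# The "name = value" layout is an assumption about the input format.
PARAM_LINE = re.compile(r"^\s*(?:\*\.)?([A-Za-z_][A-Za-z0-9_$]*)\s*=\s*(.+?)\s*$")


def hidden_parameters(text):
    """Return a dict of parameters whose names begin with an underscore."""
    found = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0]  # drop trailing comments
        match = PARAM_LINE.match(line)
        if match and match.group(1).startswith("_"):
            found[match.group(1)] = match.group(2)
    return found


# Hypothetical sample input, built from the six parameters listed in this post.
SAMPLE = """
db_block_size = 8192
_optimizer_cost_based_transformation= off
_query_rewrite_fudge = 1
_improved_row_length_enabled= FALSE
_optim_peek_user_binds = FALSE
_optimizer_autostats_job = FALSE
_optimizer_save_stats = FALSE
"""

for name, value in hidden_parameters(SAMPLE).items():
    print(f"{name} = {value}")
```

A check like this only tells you that such parameters are present. It says nothing about what each one does.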
PS. Many thanks to Chris Eaton for his expertise and help with this blog post.
Integrated Systems mean Easier Deployment and Faster Performance
If you attended the 2009 IBM Information on Demand conference, you may have seen the following video. It focuses on how two IBM clients benefit from workload-optimized solutions from IBM. A workload-optimized solution is essentially a single system where all components (hardware, database software, ETL software, business analytics software, etc.) come pre-configured and pre-integrated for optimized operation. IBM offers pre-configured and pre-integrated solutions for both OLTP and OLAP environments, with DB2 pureScale and IBM Smart Analytics System respectively.
Because these integrated systems come pre-configured for optimal performance, you don’t have to worry about integrating, balancing, and tuning the systems during deployment. This saves a lot of time, and ensures a faster time to value for the new system. Farmers Insurance estimate that they saved months of deployment time. Rooms To Go went from nothing to an entire system (extracting, transforming, loading, warehousing, and analyzing data) with dashboards in just two months.
Another benefit of these systems is improved performance. IBM tests and certifies the integrations, ensuring they are configured for optimal operation. IBM also uses established best practices to pre-tune the systems. When Farmers Insurance put their system in place, they immediately saw performance gains of 42%.
Help for Oracle DBAs Who are Moving to DB2
Many Oracle DBAs and programmers are discovering that DB2 has a lot of features that are quite familiar to them. From data types to SQL, from built-in packages to PL/SQL, many of the features that are most familiar to Oracle Database users are now supported in DB2. This support has generated some interest from notable members of the Oracle community like Steven Feuerstein and Lewis Cunningham.
If you are an Oracle DBA or programmer, and you want to learn more, there are a couple of good resources:
- First, a resource that’s free. IBM published a Redbook that describes these new features. You can download the Redbook from Oracle to DB2 Conversion Guide: Compatibility Made Easy.
- Second, a resource that costs money. IBM Education has created a new 2-day training course for database administrators, database application designers, and database application programmers who want to learn about moving from an Oracle Database environment to DB2. You can learn more about this course at Oracle to DB2 Enablement Workshop.
IBM Buys Guardium
IBM today announced the acquisition of Guardium, who are based in Waltham, Mass. This is a very exciting acquisition for IBM Information Management. The combination of IBM and Guardium technology is already helping many organizations safeguard data, monitor database activity and reduce operational costs by automating regulatory compliance tasks. The monitoring capabilities of Guardium’s technology also detect fraud and unauthorized access via enterprise applications such as an organization’s ERP, CRM or Data Warehousing solutions. You can get more details from the IBM press release at IBM Acquires Guardium.
IOD EMEA 2010 – Call for Speakers
The IBM Information On Demand EMEA 2010 conference will be held in Rome next year between Tue 18 May and Fri 21 May. The Call for Speakers is now open. Make sure to submit your proposals to speak at the IOD EMEA conference at http://www-01.ibm.com/software/uk/data/conf/programme/call-for-speakers.html. The deadline for proposals is 29 Jan 2010.
Using DB2 pureScale to Eliminate Over-Provisioning of Database Software
DB2 pureScale created quite a stir at the IBM Information on Demand conference. A number of people wanted to know more about how to use the “on-demand capacity” aspect of DB2 pureScale. I thought you would be interested to hear how some clients plan to use DB2 pureScale. I have anonymized the company names and removed any specific details about environments due to the early nature of the engagements/discussions:
- Recently, the “business” people at a US airline wanted to implement a big promotion. They built the promotion, only for the IT department to inform them that the systems would not handle the increased transactional workload, and so they cancelled the big promotion. The problem wasn’t that they could not add capacity to their systems. They could. They were using a non-IBM database system by the way. The problem was that adding database nodes to their non-IBM database cluster is a non-trivial project. And then, when you consider all the factors, it was questionable whether it made sense to remove those database nodes after the promotion. Well, this airline are now quite excited about DB2 pureScale’s ability to easily add and remove capacity. This is made possible because DB2 pureScale does not require that your applications are cluster-aware and does not impose best practices of partitioning your data across the nodes in the cluster. DB2 pureScale is truly transparent to applications. When this airline needs additional capacity, they simply add one or more logical partitions (LPARs) of DB2 pureScale to handle the additional workload, and then remove them when they are no longer needed. They pay for the additional DB2 capacity only for the duration of the promotion.
- We are currently engaged with several large retailers. These retailers typically engage in capacity planning projects every six months or so. As part of this exercise, they forecast their needs for the subsequent two or three years. These retailers determine their peak workloads, add a cushion, and then provision the necessary hardware, software, and storage. The thing is, given the nature of the retail business, much of the capacity they provision is unused for most of the year. These retailers buy a lot of hardware, and license a lot of software for that hardware, just to handle peak workloads during “busy periods”. They have to pay for all of this capacity even when they do not use it during the “normal periods”. Because DB2 pureScale has daily-based pricing and because it is so easy to add and remove capacity, many retailers can now provision the software on-demand and only pay software license fees for the capacity they actually use. (Note that this lowers software costs, not hardware costs.) These companies are forecasting that using DB2 pureScale to add and remove capacity on-demand will free up significant amounts of IT budget.
- A large insurance company is talking to IBM about being able to handle large volumes of transactions at short notice. For instance, the insurance company needs to be able to process a high number of transactions after a particularly damaging hurricane or tornado. However, they cannot accurately predict the severity or timing of these natural events. As such, their approach has been to provision for the worst case scenario. But doing this has resulted in a large amount of their IT budget being tied up in servers and software that is not being used most of the time. Now DB2 pureScale allows them to recapture a significant amount of the IT budget that is spent on database licensing and maintenance and invest it in supporting the business in new and innovative ways that help them get ahead of the competition.
- It seems obvious now, but I didn’t realize that telecommunications companies encounter large spikes in transactional workload during holidays. All those calls to family and friends generate a lot of transactions on the back end. It should not surprise you that DB2 pureScale’s ability to add or remove capacity on-demand is generating a lot of interest from telecoms providers. Of course, the continuous availability enhancements in DB2 pureScale are also very important for telecoms providers. Again, the primary benefit for these companies is the cost savings involved in not paying for extra capacity when you are not actively using that capacity.
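The arithmetic behind these savings can be sketched in a few lines. This is a deliberately simplified model, not IBM’s pricing: it assumes one flat daily license rate per unit of capacity, and every number below (the rate, the unit counts, and the number of peak days) is hypothetical. The function names are also my own.

```python
def always_provisioned_cost(peak_units, daily_rate, days=365):
    """Software cost when peak capacity is licensed for the whole period."""
    return peak_units * daily_rate * days


def on_demand_cost(base_units, peak_units, peak_days, daily_rate, days=365):
    """Software cost when peak capacity is licensed only on peak days."""
    if not 0 <= peak_days <= days:
        raise ValueError("peak_days must be between 0 and days")
    normal_days = days - peak_days
    return daily_rate * (base_units * normal_days + peak_units * peak_days)


# Hypothetical figures: 4 units for normal periods, 10 units for 45 busy days,
# priced at 100 currency units per capacity unit per day.
fixed = always_provisioned_cost(peak_units=10, daily_rate=100)
flexible = on_demand_cost(base_units=4, peak_units=10, peak_days=45, daily_rate=100)

print(f"Provisioned for peak all year: {fixed}")
print(f"Capacity added on demand:      {flexible}")
print(f"Difference:                    {fixed - flexible}")
```

The shorter the busy period is relative to the year, the larger the gap between the two figures becomes. As noted above, this covers software license fees only, not hardware.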
DB2 High Availability Licensing Explained…
There’s another great article from Paul Zikopoulos that you should know about… it explains in plain English all aspects of licensing DB2 for Linux, Unix, and Windows in high availability configurations. This is the perfect article if you are implementing a high availability configuration and don’t have time to read through the announcement letters, licensing sheets, PLETs, and so on. You can read the article at Licensing distributed DB2 9.7 servers in a high availability (HA) environment.

