Monday, March 31, 2008

mondora releases the Oblique Scaling™

Is there a missing piece between the Virtualization, Management, and Application stacks? Yes. Applications speak languages that Virtualization platforms cannot understand, which leaves them effectively unmanageable.


Applications cannot tell the outside world that they are suffering, so everything is managed only once the system is broken. All IT managers can do is restart machines, switch on CPUs, and so on. A newer approach is to monitor the system and try to anticipate suffering by watching CPU usage and similar metrics.

This approach makes some Sense for Operational Management but does not make Sense in terms of business, because it bases its predictions on data collected from the hardware rather than on the real business needs.

It is a bottom-up approach in which Hardware drives Software, and Software drives Business. Hardware, like Software, is a vehicle for helping the business, not a driver.

Imagine a hardware producer that wants to sell "normal power" and "extra power" to a startup company. "Normal power" is the basic expectation in terms of business: what the business of the next days will really be. "Extra power" covers what is unpredictable and what everyone hopes for (a big success of a campaign, etc.).

This is the way! The preferred approach is to let Business define when the application is suffering and then choose the right policy to make it profitable.
This approach is preferable in terms of Return on Investment, because the Application and the Hardware are considered a single infrastructure that adapts to business needs, and not the other way around.

Sense is a platform that describes how a system performs over time and switches over Application nodes by anticipating possible faults. Over time, Horizontal Scaling may not be enough; the Application needs more Power through Vertical Scaling, and usually there are "time slots" of CPU time available.

In mondora, Horizontal and Vertical Scaling are handled as an Application Feature that crosscuts the business behaviors. Like the Highly Available Services in the System Federation of a SOA System, each with its own Protocol and Service Level Agreement configuration, mondora enables your application to perceive Hardware as an external service itself. Considered as a Highly Available Service, Hardware can be instrumented with a Service Level Agreement specification and self-tuned to deliver the best business value while business is computing.

Sense implements several levels of Service Status; this lets the system predict when it is becoming overloaded and choose the best strategy to survive.

This is done before a fault occurs, letting the application feel whether things are going badly in terms of business.

Different strategies can be implemented when choosing, from the business point of view, how to scale.

Business is driving and mondora has produced several implementations.

For example:
- Worst Performance, Increase Power
- Latency vs Operational Time

All the above strategies are implemented and deployed (as plug-ins at runtime) as Scent, which makes it possible to switch on (and off) only the power needed to let the application run while meeting its SLA.
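To make the "Worst Performance, Increase Power" idea a little more concrete, here is a minimal sketch in Python. The Sense and Scent APIs are not shown in this post, so every name below (ServiceStatus, classify, worst_performance_increase_power) and both thresholds are hypothetical stand-ins, not the product's interface. The business-level SLA is expressed as a latency target; the policy grades the observed latency into a status level and decides how many CPUs to keep switched on before a fault occurs.

```python
from enum import Enum


class ServiceStatus(Enum):
    """Hypothetical status levels; the real Sense levels are not listed here."""
    HEALTHY = 1
    SUFFERING = 2
    OVERLOADED = 3


def classify(latency_ms: float, sla_ms: float) -> ServiceStatus:
    """Grade observed latency against the SLA target (thresholds are assumptions)."""
    if latency_ms <= 0.5 * sla_ms:
        return ServiceStatus.HEALTHY
    if latency_ms <= sla_ms:
        return ServiceStatus.SUFFERING
    return ServiceStatus.OVERLOADED


def worst_performance_increase_power(latency_ms: float, sla_ms: float,
                                     active_cpus: int, max_cpus: int) -> int:
    """Return the number of CPUs to keep switched on for the next interval."""
    status = classify(latency_ms, sla_ms)
    if status is ServiceStatus.HEALTHY:
        # Comfortably inside the SLA: switch one unit off to save power.
        return max(1, active_cpus - 1)
    if status is ServiceStatus.SUFFERING:
        # Still meeting the SLA but drifting: add power before a fault occurs.
        return min(max_cpus, active_cpus + 1)
    # SLA already missed: use everything that is available.
    return max_cpus
```

The point of the sketch is only the direction of control: the input is a business-level target (the SLA), and the hardware is adjusted to meet it, not the other way around.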

This is like the common Virtualization layer, but it adds more: rules from the business that address saving, every type of saving, from energy saving to operational cost saving. To obtain these savings, the hardware must allow parts to be powered on and off without pushing buttons. Sense will drive your saving while meeting the declared SLA.

Traditional Virtualization sits under the hood and is perceived by the Application; this lets an Application that makes Sense scale vertically on several platforms.

For further information about Oblique Scaling, contact sense@mondora.com

Sunday, March 23, 2008

OpenLDAP and HP-UX: an industrial-strength solution

Enterprise computing requires industrial-strength behavior that can survive a large amount of data and an increasing load.

Open source is sometimes not appreciated because it is thought of as "not ready" for the Enterprise Era.
These days, a good fellow, Paolo, and I are testing OpenLDAP's performance in an Enterprise Architecture.
The stress test goals were given at short notice, last Thursday, and the results must be produced quickly: next Monday at 12.00!
I love to approach everything with these objectives:
- learn something new;
- produce something valuable;
- enjoy it meanwhile, even when stressed.

Starting from "learning something new": what could be better than two HP machines partitioned into four VPARs, with 64GB of memory and 300GB of disk on each machine? OK, HP-UX is not as easy to understand as Linux, but having Bash on it makes it easier.

I like working in a pair, because:
- knowledge transfer is facilitated;
- stress is propagated.

Our client for processing huge LDIF files (millions of entries) is an 8-core Mac Pro, and the client machines we work on are our MacBook Pros: a 17" and a 15".
To better see what's going on, an Apple 30" Cinema Display is connected to the 17" MacBook Pro, and to let both of us reach each client's interface we:
- connected an external keyboard and mouse through the Cinema Display, so that a second user can work on the same computer;
- shared through VNC the 15" MacBook Pro, which is the client for writing documents and wikis and for browsing the net to find useful LDAP patches.

In terms of software, after a quick look around the net, we chose SLAMD to help us with the stress tests.
SLAMD is an open source implementation from Sun iPlanet, designed mainly to run Jobs against an LDAP server. It is a Web Application written in Java, and it runs in a Web Container such as Tomcat. We configured SLAMD on a virtual partition near the master LDAP server.
SLAMD is a good tool for stress tests, with many native features that, in our case, cover all the test cases. We chose it because the time to market of our stress test report is really short: in three days we have to configure the platform, run different kinds of tests, and produce a high-quality report. SLAMD meets many of our requirements:

1. It is strongly LDAP oriented
2. It is fully web administrable and configurable
3. It organizes distributed clients over the network in the Grinder fashion, with the idea of Distributed Computing Agents
4. Jobs can be scheduled over time with a few clicks
5. It implements a report engine that exports PDF, HTML, and text reports with graphs and the statistical information collected
6. It's easy to install: just untar and launch (in the Tomcat edition)


During stress testing, the system was monitored with Glance running on every VPAR, watching CPU performance. HP GLANCE fulfills the need for a simple-to-use, simple-to-understand performance product that examines what is happening now on a system. It displays the usage of the three critical system resources (CPU, Disc, and Memory) from a system-wide point of view and then highlights the running programs that are of special interest. HP GLANCE allows you to isolate a particular job, session, or process for additional detail, if desired.


Testing approach

Tests were conducted to observe how the system works under different loads, measuring the average response time, peaks, and other useful information.
From the first test onward, the architecture above, with Master and Slaves connected, is respected. The main principle in running the tests is this: even if a test targets only one component of the architecture, that component should not be isolated while it is being stressed. This guarantees that we monitor not just the single component but the single component within a complex system.
The variables during testing are many and span:
• time of testing;
• number of operations;
• when and where operations are executed;
• level of concurrency;
• level of randomness.
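SLAMD collects response-time figures itself, so the following is only a minimal Python sketch of what "average response time" and "peaks" mean in practice. The function name measure and the stand-in workload are my own assumptions; a real run would issue an LDAP operation where the lambda is.

```python
import time
from statistics import mean


def measure(operation, runs: int) -> dict:
    """Call `operation` `runs` times and summarize its response times in msec."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        operation()
        samples.append((time.perf_counter() - start) * 1000.0)
    return {"runs": runs, "average_ms": mean(samples), "peak_ms": max(samples)}


if __name__ == "__main__":
    # Stand-in workload; a real run would issue an LDAP search here.
    print(measure(lambda: sum(range(1000)), runs=100))
```

Reporting the peak next to the average matters because a good average can hide a few very slow operations.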

The variables shift during testing according to these criteria:

Variable | Starting from | Moving to | Comments
Users | 1 million | 2 million | first system setup
Users | 2 million | 6 million | stress testing
Stress Period | 10 minutes | 60 minutes | initial load
Stress Period | 60 minutes | 180 minutes | work load
Stress Period | 180 minutes | 720 minutes | endurance test
Threads per client | 15 | | endurance stress test


Testing was designed around concurrent clients, with a weighting factor for each of the update, delete, search, and add operations.
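A weighting factor per operation can be sketched in Python as a weighted random draw. This is an illustration, not how SLAMD implements it; the function name pick_operations and the example mix are assumptions, and the post does not say which position in a ratio such as 10:1:1:1 belongs to which operation.

```python
import random
from collections import Counter


def pick_operations(weights: dict, count: int, seed: int = 0) -> Counter:
    """Draw `count` operations at random, proportionally to their weights."""
    rng = random.Random(seed)  # seeded so a run can be repeated
    names = list(weights)
    draws = rng.choices(names, weights=[weights[n] for n in names], k=count)
    return Counter(draws)


if __name__ == "__main__":
    # Hypothetical mix: 10 searches for every update, add and delete.
    mix = {"search": 10, "update": 1, "add": 1, "delete": 1}
    print(pick_operations(mix, count=13000))
```

Over a long run the observed counts approach the configured ratio, while the order of the individual operations stays random.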

A User is defined as a complete entry built from inetorgperson.schema.
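For readers who have never seen one, this is roughly what generating such users as LDIF could look like in Python. It is a sketch under assumptions: the base DN, the uid naming pattern, and the attribute values are hypothetical and are not the ones used in our tests. Only the object class chain and the mandatory cn and sn attributes come from the standard schema.

```python
def user_entry(index: int, base_dn: str = "ou=people,dc=example,dc=com") -> str:
    """Build one inetOrgPerson entry in LDIF form for a synthetic user."""
    uid = f"user{index:07d}"
    lines = [
        f"dn: uid={uid},{base_dn}",
        "objectClass: top",
        "objectClass: person",
        "objectClass: organizationalPerson",
        "objectClass: inetOrgPerson",
        f"uid: {uid}",
        f"cn: Test User {index}",  # cn and sn are mandatory for person
        f"sn: User{index}",
        f"mail: {uid}@example.com",
    ]
    return "\n".join(lines) + "\n"


def write_ldif(path: str, count: int) -> None:
    """Write `count` entries, separated by the blank line LDIF requires."""
    with open(path, "w", encoding="ascii") as out:
        for i in range(1, count + 1):
            out.write(user_entry(i))
            out.write("\n")
```

Calling write_ldif("users.ldif", 1000000) would produce a file of one million entries, which is the order of magnitude of the first setup.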

1st Benchmark test: 1 million users - 3 concurrent slaves

This test was conducted to check the whole infrastructure, with the Master under write stress and the slaves reading the just-replicated information.

Configuration | Values
Writes per second | 20 tx/sec
Number of slaves | 3 slaves
Shared memory | 4 GB
Customer Base | 1 million
Replication mechanism | slurpd
Test time | 10 minutes

The test produced:
average search response: 0,6 msec

Having Shared Memory:

Text RSS/VSS:2.5mb/3.8mb
Data RSS/VSS:1.7gb/1.9gb
Stack RSS/VSS: 32kb/ 32kb
Shmem RSS/VSS:3.3gb/4.0gb
Other RSS/VSS:9.5mb/152mb





Test Considerations
Overall, 11.786 ADD operations were performed on the master node, and we verified that propagation to the 3 slaves through the Slurpd daemon succeeded. During slave synchronization, 3 different jobs were stressing the 3 slaves with Random Search reads, at 0,6 msec per transaction.


2nd Benchmark test: 2 million users - 3 concurrent slaves

This test was conducted to check the whole infrastructure, with the Master under write stress and the slaves reading the just-replicated information.

Configuration | Values
Writes per second | 20 tx/sec
Number of slaves | 3 slaves
Shared memory | 4 GB
Customer Base | 2 million
Replication mechanism | slurpd
Test time | 10 minutes

The test produced:
average search response: 0,592 msec

Having Shared Memory:

Text RSS/VSS:2.6mb/3.8mb
Data RSS/VSS:2.0gb/2.2gb
Stack RSS/VSS: 32kb/ 32kb
Shmem RSS/VSS:2.7gb/4.0gb
Other RSS/VSS:9.2mb/111mb






Test Considerations

Overall, 2.850 ADD, 3.084 DELETE, and 5.875 UPDATE operations were performed on the master node, and we verified that propagation to the 3 slaves through the Slurpd daemon succeeded. While the Master replicated to the slaves, 3 different jobs were stressing the 3 slaves with Random Search reads, at 0,592 msec per transaction.





3rd Benchmark test: 6 million users - 3 concurrent slaves - 10-minute run 1:2:1

This test was conducted to check the whole infrastructure, with the Master under write stress and the slaves reading the just-replicated information.

Configuration | Values
Writes per second | 20 tx/sec
Number of slaves | 3 slaves
Shared memory | 16 GB
Customer Base | 6 million
Replication mechanism | slurpd
Test time | 10 minutes

The test produced:
average search response: 0,485 msec

Having Shared Memory:

Text RSS/VSS:2.5mb/3.8mb
Data RSS/VSS:5.6gb/6.6gb
Stack RSS/VSS: 32kb/ 32kb






Test Considerations

Overall, 206.614 ADD, 224.583 DELETE, and 427.186 UPDATE operations were performed on the master node, and we verified that propagation to the 3 slaves through the Slurpd daemon succeeded. While the Master replicated to the slaves, 3 different jobs were stressing the 3 slaves with Random Search reads, at 0,485 msec per transaction.





4th Benchmark test: 6 million users - 3 concurrent slaves - 10-minute run 10:1:1:1

This test was conducted to check the whole infrastructure, with the Master under write stress and the slaves reading the just-replicated information.

Configuration | Values
Writes per second | 20 tx/sec
Number of slaves | 3 slaves
Shared memory | 16 GB
Customer Base | 6 million
Replication mechanism | slurpd
Test time | 10 minutes

The test produced:
average search response: 0,488 msec

Having Shared Memory:

Text RSS/VSS:2.5mb/3.8mb
Data RSS/VSS:5.6gb/6.6gb
Stack RSS/VSS: 32kb/ 32kb






Test Considerations

Overall, 3.975 ADD, 3.977 DELETE, and 3.817 UPDATE operations were performed on the master node, and we verified that propagation to the 3 slaves through the Slurpd daemon succeeded. While the Master replicated to the slaves, 3 different jobs were stressing the 3 slaves with Random Search reads, at 0,488 msec per transaction.





5th Benchmark test: 6 million users - Random search

This test was conducted to check the whole infrastructure, with a Slave under read test on a random cluster.

Configuration | Values
Writes per second | 20 tx/sec
Number of slaves | 3 slaves
Shared memory | 16 GB
Customer Base | 6 million
Replication mechanism | slurpd
Test time | 10 minutes
Random Cluster Size | 1 million

The test produced:
average search response: 0,436 msec

Having Shared Memory:

Text RSS/VSS:2.5mb/3.8mb
Data RSS/VSS:5.6gb/6.6gb
Stack RSS/VSS: 32kb/ 32kb
Shmem RSS/VSS: 11gb/ 16gb
Other RSS/VSS: 10mb/160mb




Test Considerations
Overall, 1.357.628 searches were performed on the slave node against a cluster of 1.000.000 random users.


6th Benchmark test: 6 million users - Update - Modify - Delete - 600msec

This test was conducted to check the whole infrastructure, with the Master under write stress and the slaves reading the just-replicated information.

Configuration | Values
Writes per second | 20 tx/sec
Number of slaves | 3 slaves
Shared memory | 16 GB
Customer Base | 6 million
Replication mechanism | slurpd
Test time | 10 minutes

The test produced:
transactions per second: update: 236,4
transactions per second: add: 236,877
transactions per second: delete: 236,4


Having Shared Memory:

Text RSS/VSS:2.5mb/3.8mb
Data RSS/VSS:5.6gb/6.6gb
Stack RSS/VSS: 32kb/ 32kb
Shmem RSS/VSS: 11gb/ 16gb
Other RSS/VSS: 10mb/160mb







Test Considerations

Overall, 141.081 ADD, 141.160 DELETE, and 141.021 UPDATE operations were performed on the master node, and we verified that propagation to the 3 slaves through the Slurpd daemon succeeded.
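As a sanity check, the totals can be turned back into an average rate. The small Python sketch below assumes the 10-minute test time from the configuration (600 seconds) and reads the totals with the period as a thousands separator; the function name implied_rate is mine.

```python
def implied_rate(total_operations: int, duration_seconds: float) -> float:
    """Average operations per second implied by a total count and a duration."""
    return total_operations / duration_seconds


if __name__ == "__main__":
    duration = 10 * 60  # the 10-minute test time, in seconds
    totals = {"add": 141081, "delete": 141160, "update": 141021}
    for name, total in totals.items():
        print(f"{name}: {implied_rate(total, duration):.3f} tx/sec")
```

The implied rates come out at roughly 235 tx/sec per operation, within about 1% of the reported figures of about 236 tx/sec, so the totals and the rates are consistent with each other.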







7th Benchmark test: 6 million users - 10:1:2:1 - 12 hours

This test was conducted to check the whole infrastructure, with the Master under write stress and the slaves reading the just-replicated information.

Configuration | Values
Writes per second | 20 tx/sec
Number of slaves | 3 slaves
Shared memory | 16 GB
Customer Base | 6 million
Replication mechanism | slurpd
Test time | 12 hours

The test produced:
transactions per second: search 1984,697
transactions per second: update 9,890
transactions per second: add 4,991
transactions per second: delete 4,991


Having Shared Memory:

Text RSS/VSS:2.5mb/3.8mb
Data RSS/VSS:5.7gb/6.8gb
Stack RSS/VSS: 32kb/ 32kb
Shmem RSS/VSS: 11gb/ 16gb
Other RSS/VSS: 11mb/168mb







Test Considerations

Overall, the following operations were performed on the master node: Search: 67.070.660; Modify: 427.186 (49.766%); Delete: 224.583 (26.163%); Add: 206.614 (24.070%). We verified that propagation to the 3 slaves through the Slurpd daemon succeeded.
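The percentages above are shares of the write operations only (searches are excluded), which is easy to check. A minimal Python sketch, with the counts read using the period as a thousands separator and a function name (shares) of my own choosing:

```python
def shares(counts: dict) -> dict:
    """Return each operation's share of the total, as a percentage."""
    total = sum(counts.values())
    return {name: 100.0 * value / total for name, value in counts.items()}


if __name__ == "__main__":
    writes = {"modify": 427186, "delete": 224583, "add": 206614}
    for name, pct in shares(writes).items():
        print(f"{name}: {pct:.3f}%")
```

The three shares add up to 100%, which confirms that the search count is not part of the base.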






8th Benchmark test: 6 million users - 10:1:1:1 - 1 hour with concurrency only on master

This test was conducted to check the whole infrastructure, with the Master under write stress and the slaves reading the just-replicated information.

Configuration | Values
Writes per second | 20 tx/sec
Number of slaves | 1 master
Shared memory | 16 GB
Customer Base | 6 million
Replication mechanism | slurpd
Test time | 1 hour

The test produced:
transactions per second: search 969,444
transactions per second: update 96,760
transactions per second: add 97,077
transactions per second: delete 96,967


Having Shared Memory:

Text RSS/VSS:2.5mb/3.8mb
Data RSS/VSS:5.7gb/6.8gb
Stack RSS/VSS: 32kb/ 32kb
Shmem RSS/VSS: 11gb/ 16gb
Other RSS/VSS: 11mb/168mb








Test Considerations

Overall, the following operations were performed on the master node: Search: 67.070.660; Modify: 427.186 (49.766%); Delete: 224.583 (26.163%); Add: 206.614 (24.070%). We verified that propagation to the 3 slaves through the Slurpd daemon succeeded.







9th Benchmark test: 6 million users - 10:2:2:2 - 1 hour with concurrency only on master

This test was conducted to check the whole infrastructure, with the Master under write stress and the slaves reading the just-replicated information.

Configuration | Values
Writes per second | 20 tx/sec
Number of slaves | 1 master
Shared memory | 16 GB
Customer Base | 6 million
Replication mechanism | slurpd
Test time | 1 hour

The test produced:
transactions per second: search 678,123
transactions per second: update 135,643
transactions per second: add 135,703
transactions per second: delete 134,770


Having Shared Memory:


Text RSS/VSS:2.5mb/3.8mb
Data RSS/VSS:5.7gb/6.8gb
Stack RSS/VSS: 32kb/ 32kb
Shmem RSS/VSS: 11gb/ 16gb
Other RSS/VSS: 11mb/168mb


Test Considerations

Overall, the following operations were performed on the master node: Search: 67.070.660; Modify: 427.186 (49.766%); Delete: 224.583 (26.163%); Add: 206.614 (24.070%). We verified that propagation to the 3 slaves through the Slurpd daemon succeeded.







10th Benchmark test: 6 million users - Random search

This test was conducted to check the whole infrastructure, with the Master under write stress and the slaves reading the just-replicated information.

Configuration | Values
Writes per second | 20 tx/sec
Number of slaves | 1 master
Shared memory | 16 GB
Customer Base | 6 million
Replication mechanism | slurpd
Test time | 1 hour

The test produced:
average search response: 1.686 msec

Having Shared Memory:

Text RSS/VSS:2.5mb/3.8mb
Data RSS/VSS:5.6gb/6.6gb
Stack RSS/VSS: 32kb/ 32kb
Shmem RSS/VSS: 11gb/ 16gb
Other RSS/VSS: 10mb/160mb





Test Considerations

Overall, 17.976.344 searches were performed on the slave node against a cluster of 1.000.000 random users.
Delete: 224.583 (26.163%); Add: 206.614 (24.070%) on the master node; we verified that propagation to the 3 slaves through the Slurpd daemon succeeded.



Conclusion

OpenLDAP performed really well in the HP-UX Cluster environment; during the whole test phase, no restart of Slapd was required and no restart of Slurpd was done.

OpenLDAP and HP-UX were used as is and were not tuned for better performance. OpenLDAP was patched with the official patches available online.

The VPAR configuration was tested while OpenLDAP was running, with cores hot-enabled at run time.