Re: UU: User state



I have a perfect chance now to observe the strategy "multiple app-servers + single DB server" in action.
The place is a bank here, in AU, and I'm looking at this from inside.
They have about 20 application servers and a single 8-way DB server. The primary problems:

1) Performance. They have reached the point where adding more app servers slows down the database.
2) Code distribution. A fix in the database in many cases requires C++ code re-distribution, and updated parts of the system have to be offline.
3) Code manageability. They have tons of C++ code (250+ CGI programs) that all have their little bells and whistles. If something doesn't work in such an app, they debug it, recompile it, and ship it. It has to be tested thoroughly. Part of the problem is that sometimes it isn't clear what the code is doing. The support team is constantly busy. Overall, it is a mess.

The most important conclusion so far: by adding a lot of application servers, we move the bottleneck (only partially!) from data processing to data exchange, but we keep the same single DB server as a bottleneck. That scheme doesn't give good performance.

2006/9/14, Ilya A. Volynets-Evenbakh <ilya@total-knowledge.com>:
I agree with the general approach, or at least with going in that
direction. Let's look at a few specific future requirements, though.

1. Horizontal scalability, i.e. when we run out of power, we add more
servers.
2. Database and application servers on different machines. As a result,
we potentially need to be able to cache object data in the application
server for efficiency reasons.

Let's take a look at #1.
In case we need to distribute the application due to lack of power, we
will need to have multiple machines for the most power-hungry part. In
your case it's the database.
How easy is it to do with our database server? How efficient is it? What
other constraints exist?
In the all-C++ logic case, distributing will be fairly easy (almost
perfect load balancing can be implemented in the cppserv front-end web
server with ~30 lines of code).
Issues: data consistency, data coherency. Network efficiency might still
be an issue in this case as well.
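
To make the "~30 lines" estimate a bit more concrete, here is a minimal
sketch of a round-robin backend picker. It is only an illustration: the
class name RoundRobinBalancer and the backend address strings are
hypothetical and are not part of cppserv, and it ignores the consistency
and coherency issues mentioned above.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical round-robin picker; not an actual cppserv API.
class RoundRobinBalancer {
public:
    explicit RoundRobinBalancer(std::vector<std::string> backends)
        : backends_(std::move(backends)) {
        if (backends_.empty())
            throw std::invalid_argument("at least one backend is required");
    }

    // Returns the next backend address, cycling through the list in order.
    const std::string& next() {
        const std::string& chosen = backends_[cursor_];
        cursor_ = (cursor_ + 1) % backends_.size();
        return chosen;
    }

private:
    std::vector<std::string> backends_;  // addresses of the app servers
    std::size_t cursor_ = 0;             // index of the next backend to use
};
```

A front end would call next() once per incoming request and forward the
request to the returned address. This spreads requests evenly, but it
does nothing about the single database behind the app servers.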

#2. This one is really unclear. Let's see what could be achieved by
caching objects in app server memory:
On one hand, caching objects in memory might seem to save data-retrieval
round trips.
However, in reality, when an object is requested, it is often requested
in order to be modified.
Now, we still have to have some sort of a handle on an object in order
to modify it, so caching _might_ save us round trips. Meh... Needs more
thought.

Anyway, #1 seems like an important aspect that needs to be addressed.
How does your approach fit with it?


Alexey Parshin wrote:
> A C++ module shouldn't have its own copy of data, especially data
> relations. Instead, it assumes this to be implemented in the DB.
> Methods in C++ classes, in that case, simply call stored procedures.
> For instance, we need to add a relation between a student and a
> problem. We should have a stored proc like student_assign_problem that
> does the job and reports the result. Another stored proc should report
> the list of problems assigned to a student.
>
> Pros:
> - the business logic is concentrated in one place
> - maximum possible performance, if we are not doing high-level math
> - data processing in SQL is pretty simple
> - bugs in stored procs are easier to fix (we just replace a proc in
> real time)
> - in the general case (didn't check enough with Postgres yet), the SQL
> server validates the DML in stored procs, preventing some runtime errors
> Cons:
> - SQL languages we can use require C-style programming
> - debugging stored procs is more difficult than C/C++ code
>
> The opposite implementation, when we have the logic implemented in
> C++, basically treats SQL server as dBase, having a data copy on the
> client and doing most of the processing there. That approach is
> acceptable when a few people are using a database. As the number of
> users grows, it slows the database down.
> Pros:
> - the code may be taken to pretty high abstraction level,
> encapsulating everything inside the classes
> - easy to debug data processing
> - the language (C/C++) is a sweety
> Cons:
> - processing data involves a data copy on the client, and therefore a
> read-change-write round trip, which adds extra steps to processing
> (performance issues)
> - data modifications can be made from different places. Often (not
> necessarily true for us), people loosen the database security and
> integrity constraints to have more possibilities for data modification,
> which leads to broken data integrity and poor performance
> - fixing bugs in logic requires recompiling the C/C++ program and
> shipping it to the customer. That is more difficult than patching a
> stored proc
>
> There are some pros and cons I'm probably missing.
>
> It's possible to use a combined approach. In my experience, it
> combines the problems of both prior approaches and the advantages of
> neither :(
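
The stored-procedure approach quoted above could look roughly like the
sketch below: the C++ class holds no copy of the relations and only
forwards calls to the database. Everything except the proc name
student_assign_problem is an assumption made for illustration: the
Database interface, the callProcedure signature, the "ok" result
convention, the student_list_problems name, and the in-memory
FakeDatabase that stands in for a real server so the sketch can run.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical database interface: one call per stored procedure.
struct Database {
    virtual ~Database() = default;
    virtual std::vector<std::string> callProcedure(
        const std::string& name, const std::vector<std::string>& args) = 0;
};

// The C++ class keeps no relation data of its own; it only calls procs.
class Student {
public:
    Student(Database& db, std::string id) : db_(db), id_(std::move(id)) {}

    // Asks the DB to link this student to a problem; true on "ok".
    bool assignProblem(const std::string& problemId) {
        auto result = db_.callProcedure("student_assign_problem",
                                        {id_, problemId});
        return !result.empty() && result.front() == "ok";
    }

    // Asks the DB for the problems assigned to this student.
    std::vector<std::string> problems() {
        return db_.callProcedure("student_list_problems", {id_});
    }

private:
    Database& db_;
    std::string id_;
};

// In-memory stand-in for the real server (assumption, for the demo only).
class FakeDatabase : public Database {
public:
    std::vector<std::string> callProcedure(
        const std::string& name,
        const std::vector<std::string>& args) override {
        if (name == "student_assign_problem" && args.size() == 2) {
            assignments_[args[0]].push_back(args[1]);
            return {"ok"};
        }
        if (name == "student_list_problems" && args.size() == 1)
            return assignments_[args[0]];
        return {"error: unknown procedure"};
    }

private:
    std::map<std::string, std::vector<std::string>> assignments_;
};
```

With this shape, a fix to the assignment logic means replacing the stored
proc on the server, while the C++ wrapper stays unchanged as long as the
proc's name and arguments do.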
--
Ilya A. Volynets-Evenbakh
Total Knowledge. CTO
http://www.total-knowledge.com




--
Alexey Parshin,
http://www.sptk.net

Authoright © Total Knowledge: 2001-2008