Working with data - databases.
|On this page:||Other pages:|
* olap oltp
* old, mainframe
|Intorduction||home - top of the page - email|
Database usually is a collection of tables (and tables, and tables, and tables) - which constitute "relational" database (relationships between tables). There are also alternative approaches, for example objectdesign.com uses objects instead of tables.
- databases on Google.
* directory.google.com/Top/Computers/Programming/Databases/ - databases on Google.
* searchdatabase.com - portal on DB technologes
* searchdatabase.techtarget.com/searchDatabase_Editors_Picks_Page/0,281922,,00.html - some good articles
Most databases are relationaland and understand standard language - SQL (Structured Query Language). SQL queries let you select, insert, update, find out the location of data, and so forth. See more at sql.html.
Here is the timeline (shows only some databases)
1970 - Dr. E.F. Codd formulated concept of a relational database. ( explained here)
1976 - IBM developed a prototype relational database management system and language (System/R).
1977 - Larry Ellison founded Oracle.
1979 - Oracle Corporation introduced the first commercial SQL database (written in assembler for PDP-11). Teradata is founded.
1983 - IBM released the first version of DB2 (for mainframe - MVS).
1987 - Sybase releases its first commercial DBMS product ("DataServer") for Sun workstations (unix).
Sybase partner with Microsoft (divide the market - MS - on Windows & OS/2, Sybase - on Unix & VMS).
Microsoft also parners with Ashton-Tate (DBase).
1989 - Microsoft SQL Server Version 1.0 ships
1991 - NCR (AT&T) acquires Teradata
1993 - Microsoft and Sybase end their partnership
1995 - Sybase - SQL Server 11. Sybase merges with PowerSoft (PowerBuilder).
Microsoft SQL Server 6.0. DB2 for OS/2 and for AIX
1996 - Sybase Adaptive Server Enterprise 11.5, Microsoft SQL Server 6.5.
1998 - Sybase v.11.9.2. Oracle 8. Microsoft SQL 7.
2000 - Sybase 12. Oracle 8i. Microsoft 2000. DB2 ver.7.
2001 - Oracle introduces RAC (Real Application Clusters)
2007 - Teradata is separated form NCR
2008 - Sybase 15. Oracle 11. Microsoft SQL Server 2008. DB2 ver. 9.
As of today (January 2008) Sybase is still a very common database on Wall Street. Reasonable question may be why Sybase became so common, when Oracle and IBM had their products on the market much earlier ?". My guess is that Sybase had Client-Server architecture on Unix at the time when financial companies were starting to use Unix.
|Client-Server||home - top of the page - email|
Database "server" can communicate with many "client" applications (2 layer Client-Server).
There can be also some middle layer(s) - very common is a 3-tier
- Front-end (text, forms, etc.) - browser, Java application, MS Excel, etc.
- Middle - programms connecting the front-end with the back-end.
- Back-end - data (database, files).
Fat client - lots of application code and some data are
stored on the client side
Thin client - code and data are not stored on the client. They are loaded fromthe server - and run on the client (browser).
There can be many layers (more than 3). Application can be distributed between different locations, use different databases, different clients, and different servers communicating with those databases, clients, and other servers.
In distributed environment of today's large financial institutions, people often use a messaging system (like MQSeries from IBM) and code messsages in XML. Many claim that messaging is much more convenient to use than CORBA or Orbix.
To communicate with the database from your application, you need drivers.
Each database has each own drivers for different languages. Often
it is a good idea to use standard drivers:
* ODBC (Open Database Connectivity) - standard and open API for database access (Windows).
* JDBC (Java Database Connectivity) - API for database access from Java
|OLTP & OLAP||home - top of the page - email|
OLTP - Online Transaction
- used to update database, typically high volume, response time is important
OLAP - Online Analytical Processing (also called DSS - Decision Support Systems, EIS - Enterprise Information Systems)
- used to help review and analyse information in the database. Typically doesn't update database. Queries may be complicated and long.
Another name of this system type is DSS (Decision Support System). Related terms - "data mining" and "data warehouse" - check their exact meaning on whatis.com.
- Data Warehousing for Cavemen
* perso.wanadoo.fr/bernard.lupin/english/ - Bernard Lupin's OLAP site/tutorial
* www.olapreport.com/ - The OLAP Report
* www.datamining.com/bridge.htm - OLAP and DataMining: Bridging the Gap
* www.olapcouncil.org/ - OLAP Council at 888-818-OLAP
* msdn.microsoft.com/vfoxpro/technical/olap.asp - Microsoft SQL Server OLAP Services 7.0 OLAP Manager Tutorial
* www.planetit.com/techcenters/docs/database/opinion/PIT20000612S0001 - How CRM Differs From OLAP
Since ~2005 - new generation of databases for data analytics:
New grid analytic databases
Cells of data in a table - [row_id,column_id,value].
REPLICATION (Replication Server)
- software technology to copy data from one database (primary server) to
another (secondary) in more or less real time. Sybase has a very good replication
server. There are 2 sides to replication process: to send information from
the primary server (this takes ~ 10% of server load) - and to catch and
update on the secondary server (this may put significant load on the secondary
server, because you have to do inserts and deletes).
|Different databases - and how to use them||home - top of the page - email|
- all databases
* www.arsdigita.com/asj/ - several excellent books and a lot of discussions
* www.pcslink.com/~ej/dbweb.html - Database Jump Site
- DB2 universal database (Unix, NT, Mainframe, ...) - read on db2.html
* www.oracle.com - very popular database - see oracle.html page
* www.sybase.com - see sybase.html page.
* www.microsoft.com - for MS SQL server (also Exchange and Access) - see ms_sql_server.html page.
( - msdn.microsoft.com/vfoxpro/ - Microsoft Visual Fox Pro )
* www.teradata.com - Teradata - datawarehouse engine.
* www.objectdesign.com - object database (C++ or Java programming)
= www.mysql.org/ - (see also mysql.html)
good, fast, support regular expression search, doesn't support transactions
* www.postgresql.org/ - free database with transactions (a part of Linux distribution)
* www.illustra.com - Illustra database
* licensetospeed.com/homepage.html - InterSystems Corp. - Cache - good, and very fast
* www.solidtech.com/ - good, but slow
* www.unify.com/ - good sql database
Some older technologies:
* www.netlib.com/rbffaq01.htm - read about Clipper and Btrieve
* pervasive.sw.com.sg/products/ - "Pervasive.SQL" - developed from "Btrieve" and "Scalable"
|Tools||home - top of the page - email|
- PLATINUM ERwin from Computer Assotiates - a database design tool.
* www.embarcadero.com/ - Embarcadero - creators of DBArtisan and RapidSQL. Excellent tools !!
( read reviews on this site: http://www.mssqlserver.com/ - DBArtisan and RapidSQL )
Work with all modern databases, allow to transparently copy objects between databases of different types or versions.
http://www.aquafold.com/downloads.html - Aqua Studio - supports multiple databases (Oracle, DB2, Sybase, MS SQL, MySQL, Informix, PostgreSQL, etc.
Short recommendatin: - for all major databases (DB2, Oracle, Sybase, MS SQL Server) - use DBArtisan for administration and SQL Navigator or RapidSQL for development. For MySQL you can use their own command-line client or GUI.
See more offers on pages related to particular databases: db2
- sybase - oracle -
|Old, Mainframe||home - top of the page - email|
Some old MainFrame technologies:
* www.cai.com/solutions/ - Computer Associates International (bought Clipper, IDMS, Platinum)
* IDMS - www.cai.com/products/idms_dds.htm - hierarchical database (non-relational)
* ADABASE (sets of tables)
* IMS - hierarchical database (non-relational) - http://www-306.ibm.com/software/data/ims/ - used by 95% of Fortune 1000 companies. Serves 200 million end users, managing over 15 million gigabytes of production data. IMS processes over 50 billion transactions every day. IMS still owns the high-volume online transaction and database management environment. IMS comes with JDBC connectors, Java and XML support, gui administration front end, fastpathing, checkpoint recovery. Why would IBM continue releasing new versions of IMS when they own DB2? Because IMS is the fastest at transaction processing. They recently achieved the following: IMS Runs Over 21,000 Transactions per Second on a Sysplex (Nearly Two Billion Transactions per Day). This IMS fact sheet gives some highlights: http://www-306.ibm.com/software/data/ims/v9/v9fact.pdf
* ISAM - Indexed Sequential Access Method (a file access method supporting both sequential and indexed access).
* VSAM - Virtual Storage Access Method - indexed files - check on www.whatis.com
* DBase - small database for personal computers, produced by Ashton-Tate in 1979 (1984 - DOS version, 1991 - sold to Borland, 1993,95 - DBase for windows shipped by Borland).
Some old languages used with old databases:
* JCL - Job Control Language
* COBOL-2 - Common Business-Oriented Language
* CICS - Customer Information Control System - adds interactivity to COBOL, works with VSAM files, or with databases (DB-2, IMS, IDMS)