Rows
and Columns
A database makes it easy to find specific bits of information. It does this by
organizing its contents into columns and rows (also known as fields and records).
A column is an area reserved to hold a specific type of data. A row is a group
that contains one piece of data for every column in the database.
|
Column |
Column |
Column |
Column |
Row |
|
|
|
|
Row |
|
|
|
|
Row |
|
|
|
|
Row |
|
|
|
|
Row |
|
|
|
|
For example, if you wanted to create a database of names and addresses, you
would probably have a column for first name, another column for last name, a
column for telephone number, a column for postal code, and so forth. You would
create a row for each person in the database, and each column of that row would
contain the appropriate piece of information about that person.
|
Last
Name |
Street
Address |
City |
State |
Invoice
No. 970-09 |
Smith |
879
Maple Ave. |
New
York |
NY |
Invoice
No. 849-98 |
Jones |
8754
Main Street |
Chicago |
IL |
Invoice
No. 833-90 |
Johnson |
976
Oak Blvd. |
Los
Angeles |
CA |
Invoice
No. 455-98 |
Rodriguez |
347
Elm Lane |
San
Francisco |
CA |
Invoice
No. 332-09 |
Ramirez |
4215
Sycamore Way |
Boston |
MA |
Why organize information this way? Because it makes it faster and easier for database
software to work with that information. Here are just a few of the advantages
of column/row organization:
Finding: If each address in the name-and-address database were a single,
all-in-one piece of information, how would you find a group of addresses with
the same postal code? When addresses are stored in separate columns, it's much
easier and faster for database software to show all the rows in which the postal
code matches the one you want.
Sorting: Because each piece of information has its own place in the database,
the database software can easily sort a group of rows by any type of information.
For example, having first and last names in separate columns makes it easy and
fast for the database software to sort the rows by first name or last name.
Formatting: Each column in a database is set up to hold a particular kind
of information in a particular format. For example, a postal code column might
be set up to accept only a five-digit number. If you try to enter the wrong type
of information into a column, you will most likely get an error message. This
can help you to detect incorrectly entered information (such as postal codes that
are missing a digit or contain the letter O rather than a zero).
Don't confuse database with database tool. A database
is simply a carefully organized grouping of data in a predetermined format; a
database tool is software you use to access that data. You might use a variety
of database tools to access and update the data in a database.
Relational Databases
A relational database is a special kind of database designed to save space and
cut down on errors by avoiding the duplication of data. It consists of a number
of linked tables, each of which contains its own set of columns and rows.
To illustrate why you might want a relational database, lets say you want
a database that keeps track of all the customer invoices your business creates.
You could create a database containing one row for each invoice you write, with
columns for the invoice number, the date, the invoice amount, and the customers
name, address, and phone number.
Such a database might serve your purposes just fine. However, there are a few
problems you might run into after a while:
Space: Invoice numbers may not take up very much space in a database, but
customer names and addresses do. And since you probably have repeat customers,
youll probably have many copies of such customers names and addresses,
needlessly taking up lots of space in your invoice database.
Duplication: What if you want to mail a flyer to everyone in your database?
If you simply print out a mailing label for every invoice entry in your database,
youll end up sending more than one copy to any customers who have used your
business more than once.
Accuracy: People make mistakes. What if your data entry person misspells
a customers name in one row, but not in another? What if they spell out
Street instead of abbreviating it to St.? Youll
end up with two different names or addresses for the same person, and that makes
it harder to weed out duplicates.
Updating: What happens if one of your customers moves and sends you a change-of-address
form? Youll have to update every occurrence of that customers address
in your invoice database.
A relational database solves all of these problems by storing your invoice information
in one table and your customer information in another. Instead of reproducing
all the customer information for each invoice, each row in the invoice table simply
contains a short code that refers to a particular row in the customer table. Because
each customers name and address is stored only once - in the customer table
- you dont have to worry about any of the problems listed above.
The tables in a relational database refer to one another using codes stored in
columns called keys. There are two types of keys:
Primary key: Each table has a column that contains a different value for
each row in the table. This column is called that tables primary key. You
can immediately find any single row in a table if you know its primary key value,
because there will be only one row in the table for which the primary key column
contains that value. In our example, the invoice number might be the primary key
for the invoice table, because each row in the invoice table has a unique invoice
number.
Foreign key: When a row in a table refers to a single record in another
table - for example, when an invoice refers to a row in your customer table -
it uses the referred-to tables primary key value. In the referring table,
the value is called a foreign key, because its the key to a single row in
a different table.
Metadata
Databases can also be organized to store information about the data they contain.
For example, you might want to add a column to your name and address database
that tells when a given row was entered, and another that tells when it was last
modified. This kind of information is called metadata, or data about data.
Keywords
What if you want to add a label column to your name and address database
so you can record whether each person is a close friend, an acquaintance, or a
business associate?
This introduces another kind of information to your database, usually referred
to as keyword information. Having a keyword field usually means that
when you add a row of information to your database, you choose from a predefined
series of keywords that tell something about that row (for example, Friend,
Acquaintance, or Associate). Having a keyword field like
this one might, for example, help you quickly find or sort your names and addresses
based on your relationship with the corresponding people.
Queries
In order to find data in a database, you must find that data with a query. Generally
speaking, a query (or search) is a request for all data that matches
a specific set of search criteria.
A query can be constructed in several different ways:
· By choosing criteria from a list or menu: This is probably the
easiest of the three options, but it is also the most restrictive; your queries
are limited because the available criteria are predefined.
· Using query by example (QBE): This method lets you enter values
into fields in a pre-defined search form. It provides more flexibility than the
first method, but you may be restricted by the number and choice of fields included
in the search form.
· Using a query language such as SQL (Structured Query Language) or
ODBC (Open Database Connectivity): This method, while being the most powerful,
is the most complicated because it requires knowledge of a specific set of commands
and rules.
Transactions
Many modern databases today are designed to be used in mission-critical situations
by a large number of people. This heavy usage, in an environment where accuracy
is vital, can cause potential problems.
For example, consider a bank that stores all of its account information in a database.
Say a user requests a transfer of $500 from her checking account to her savings
account. What if the ATM sends the request to subtract $500 from the checking
account, but then a power outage prevents it from sending the request to increase
the balance of the savings account by the same amount?
The concept of transactions was invented to solve these type of problems. To solve
the problem in this example, the database server application groups both requests
into a single unit (a transaction), ensuring that either both of them are performed
or neither of them is performed. So if something prevents the balance of the savings
account from being increased by $500, the request for the checking account to
be decremented by $500 is rolled back, or cancelled.
What if two requests to withdraw $1000 from an account are received simultaneously?
If the accounts balance is $1500, only one of these requests should be honored.
To solve this problem, database server software makes sure that transactions involving
a particular chunk of data are processed one at a time, even if they are submitted
simultaneously. In this case, one of the two withdrawal requests would be rolled
back, even if the previous query had shown that the balance was high enough to
allow the withdrawal.
· Atomic means that either all of the requests in a transaction
must be performed or the entire transaction must be rolled back.
· Consistent means that the database must adhere to a set of correctness
constraints (in the same way that a balance sheet, for example, must balance),
both before and after every transaction.
· Isolated means that two transactions involving the same data must
be processed sequentially, even if they are submitted simultaneously.
· Durable means that the database must be restorable even if a disaster
such as a hard disk crash occurs.
TrueStep did not create this white paper. Courtesy of Oracle Corporation.