A—Databases

A.1 Basic concepts (5 hours)

A.1.1 Outline the differences between data
and information.
2 Data is meaningless. To be useful, data must be interpreted to produce information.


A.1.2 Outline the differences between an information system and a database.
2 Students must be aware that these terms are not synonymous.
Databases are a component within an information system.

A.1.3 Discuss the need for databases.

This should address topics such as the benefits of data sharing.


S/E For example, correct information relating to customers and/or clients.


A.1.4 Describe the use of transactions, states and updates to maintain data consistency (and integrity).
2 For example, to ensure data
consistency when moving money
between two accounts it is necessary
to complete two operations
(debiting one account and crediting
the other). Unless both operations
are carried out successfully, the
transaction will be rolled back.
S/E For example, ensuring correct
information relating to customers
and/or clients.
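
The money-transfer example in A.1.4 can be sketched as follows. This is a minimal illustration using Python's built-in sqlite3 module; the account table, its CHECK constraint and the transfer function are assumptions made for the example and are not part of the syllabus.

```python
import sqlite3


def transfer(conn, from_id, to_id, amount):
    """Credit one account and debit another as a single transaction."""
    try:
        # "with conn" commits if both updates succeed and rolls back if either raises.
        with conn:
            # Credit first, so that a failed debit leaves something to undo.
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, to_id),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, from_id),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected the debit; the credit was rolled back too.
        return False


def balances(conn):
    """Return {account id: balance} for every account."""
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES (1, 100), (2, 50)")
conn.commit()

ok = transfer(conn, 1, 2, 30)        # both operations succeed
failed = transfer(conn, 1, 2, 500)   # the debit would overdraw account 1
```

After the failed transfer the balances are the same as before it started, so the database moves from one consistent state to another or does not change at all.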

A.1.5 Define the term database transaction. 1
A.1.6 Explain concurrency in a data sharing
situation.
3
A.1.7 Explain the importance of the ACID
properties of a database transaction.
3 ACID refers to:

atomicity

consistency

isolation

durability.
A.1.8 Describe the two functions databases
require to be performed on them.
2 Query functions and update
functions.
A.1.9 Explain the role of data validation and
data verification.
3

A.2 The relational database model (15 hours)
Assessment statement Obj Teacher's notes
A.2.1 Define the terms: database
management system (DBMS) and
relational database management
system (RDBMS).
1
A.2.2 Outline the functions and tools of a
DBMS.
2 A range of management functions
and tools should be appreciated
focusing on the creation,
manipulation and interrogation of a
database.
A.2.3 Describe how a DBMS can be used to
promote data security.
2 Features involving data validation,
access rights and data locking.
A.2.4 Define the term schema. 1
A.2.5 Identify the characteristics of the three
levels of the schema: conceptual,
logical, physical.

https://www.1keydata.com/datawarehousing/data-modeling-levels.html


2
A.2.6 Outline the nature of the data
dictionary.
2
A.2.7 Explain the importance of a data
definition language in implementing a
data model.
3
A.2.8 Explain the importance of data
modelling in the design of a database.
3
A.2.9 Define the following database terms:
table, record, field, primary key,
secondary key, foreign key, candidate
key, composite primary key, join.
1 These are the accepted terms.
Table is equivalent to relation/file.
Record is equivalent to tuple/row.
Field is equivalent to attribute/column.
Only knowledge of an inner join is
required.
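
As an illustration of the terms in A.2.9, the sketch below builds two small tables linked by a foreign key and combines them with an inner join. It uses Python's sqlite3 module; the customer and orders tables and their contents are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked
conn.executescript(
    """
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,   -- primary key
        name        TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL
                    REFERENCES customer(customer_id),   -- foreign key
        item        TEXT NOT NULL
    );
    INSERT INTO customer VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Chi');
    INSERT INTO orders VALUES (10, 1, 'pen'), (11, 1, 'ink'), (12, 2, 'pad');
    """
)

# An inner join keeps only the records that have a match in both tables.
rows = conn.execute(
    """
    SELECT customer.name, orders.item
    FROM customer
    INNER JOIN orders ON orders.customer_id = customer.customer_id
    ORDER BY orders.order_id
    """
).fetchall()
```

Customer 3 has placed no orders, so no record for that customer appears in the joined result.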
A.2.10 Identify the different types of
relationships within databases: one-to-one,
one-to-many, many-to-many.
2 AIM 4 Demonstrate initiative in
applying thinking skills critically
to understand the relationship
between entities in a specified
situation.
LINK Thinking abstractly.
A.2.11 Outline the issues caused by
redundant data.
2 S/E, AIM 8 Issues relating to the
integrity and reliability of data.
A.2.12 Outline the importance of referential
integrity in a normalized database.
2 S/E, AIM 8 Issues relating to the
integrity and reliability of data.
A.2.13 Describe the differences between 1st
Normal Form (1NF), 2nd Normal Form
(2NF) and 3rd Normal Form (3NF).
2 For example:

1NF has no repeating rows or groups of
columns; each field holds a single value.

2NF is based on full functional
dependency.

3NF involves the removal of
transitive dependencies.
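
The three normal forms listed for A.2.13 can be illustrated with a small worked decomposition. The enrolment data, field names and dictionaries below are assumptions chosen for the example; plain Python structures stand in for tables.

```python
# 1NF: every field holds a single value. The key is (student_id, course_id).
# Fields: student_id, course_id, course_title, teacher_id, teacher_room, grade
enrolments_1nf = [
    ("S1", "CS", "Computer science", "T1", "R12", 6),
    ("S2", "CS", "Computer science", "T1", "R12", 5),
    ("S1", "MA", "Mathematics", "T2", "R07", 7),
]

# 2NF: the course details depend on course_id alone, which is only part of
# the key, so they move to their own table. The grade depends on the whole key.
enrolment = {(s, c): grade for s, c, _, _, _, grade in enrolments_1nf}
course_2nf = {c: (title, t, room) for _, c, title, t, room, _ in enrolments_1nf}

# 3NF: teacher_room depends on teacher_id, which in turn depends on course_id.
# Removing this transitive dependency gives a separate teacher table.
course = {c: (title, t) for c, (title, t, _) in course_2nf.items()}
teacher = {t: room for _, t, room in course_2nf.values()}
```

In the final form each fact, such as the room of teacher T1, is stored exactly once.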
A.2.14 Describe the characteristics of a
normalized database.
2 Students will need to understand
the characteristics of a database
normalized to 3NF.
A.2.15 Evaluate the appropriateness of the
different data types.
3 Students will be expected to be able
to justify the selection of a particular
data type in a specified situation. For
example, integer or floating point.
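
One way to justify a choice between integer and floating point is to look at how each behaves with money. The sketch below is an illustration only; the variable names and the decision to store amounts as whole cents are assumptions for the example.

```python
from decimal import Decimal

# Binary floating point cannot store 0.10 or 0.20 exactly,
# so the sum is not exactly 0.30.
float_total = 0.10 + 0.20

# Storing a whole number of cents as an integer keeps the arithmetic exact.
cents_total = 10 + 20

# A fixed-point decimal type is another exact option.
decimal_total = Decimal("0.10") + Decimal("0.20")

float_is_exact = float_total == 0.30   # False
```

For a price field this suggests an integer or fixed-point type, while floating point remains a reasonable choice for measurements where small rounding errors are acceptable.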
S/E, AIM 8 The question of privacy
for stakeholders.
S/E, AIM 8 The end-user must be
seen as a key stakeholder when
planning a new system.
Comparing the different needs of
each stakeholder.
Who is a relevant stakeholder?
A.2.16 Construct an entity-relationship
diagram (ERD) for a given scenario.
3 Students will be expected to
construct entity-relationship
diagrams in 3NF for a relational
database.
AIM 4 Demonstrate skills enabling
an understanding of the relationship
between entities in a specified
situation.
LINK Thinking abstractly.
MYP Technology: databases.
A.2.17 Construct a relational database to 3NF
using objects such as tables, queries,
forms, reports and macros.
3 Students will be expected to
demonstrate knowledge of database
designs in the SL/HL paper 2
that have resulted from practical
activities.
TOK Utilitarianism, the greatest
good for the greatest number. The
ends justify the means.
AIM 4, AIM 6 Demonstrate initiative
in applying thinking and problem-solving
skills critically to understand
the relationship between entities in
a specified situation.
AIM 5 The need to collaborate
effectively with the end-user to
resolve complex problems.
S/E, AIM 8 An awareness of
the social impacts and ethical
considerations when developing
systems that potentially provide
access to sensitive data.
A.2.18 Explain how a query can provide a
view of a database.
3
A.2.19 Describe the difference between a
simple and complex query.
2 Students will be expected to be able
to:

use Boolean operators such as
AND, OR, NOT

create parameter queries

create derived fields.
MYP Mathematics: forms of
numbers, algebra—patterns and
sequences, logic, algorithms.
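
The three features listed for A.2.19 can be seen side by side in one query. SQL is used here purely as an illustration, run through Python's sqlite3 module; the product table, its data and the in_stock_value function are assumptions for the example, and the same query could be built with a visual query tool.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE product (
        name        TEXT PRIMARY KEY,
        category    TEXT,
        price_cents INTEGER,
        stock       INTEGER
    );
    INSERT INTO product VALUES
        ('pen', 'office', 120, 40),
        ('pad', 'office', 350, 0),
        ('mug', 'kitchen', 600, 5);
    """
)

# Simple query: one table, no criteria.
simple = conn.execute("SELECT name FROM product ORDER BY name").fetchall()


def in_stock_value(conn, category, max_price_cents):
    """Complex query: parameters, Boolean operators and a derived field."""
    query = """
        SELECT name, price_cents * stock AS stock_value
        FROM product
        WHERE category = ?
          AND price_cents <= ?
          AND NOT stock = 0
        ORDER BY name
    """
    # The two "?" placeholders are filled in when the query is run (parameter query).
    # stock_value is not stored in the table; it is calculated (derived field).
    return conn.execute(query, (category, max_price_cents)).fetchall()


office = in_stock_value(conn, "office", 500)
kitchen = in_stock_value(conn, "kitchen", 1000)
```

The same query text serves both calls; only the parameter values change.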

A.2.20 Outline the different methods that can
be used to construct a query.
2 Students will not be expected to be
able to write queries in SQL.
Students are expected to be aware
of the language as a tool for data
interrogation.
MYP Mathematics: forms of
numbers, algebra—patterns and
sequences, logic, algorithms.

A.3 Further aspects of database management (10 hours)
Assessment statement Obj Teacher's notes
A.3.1 Explain the role of a database
administrator.
3 A visiting speaker or the school
network manager could discuss his
or her role.
S/E Issues relating to privacy,
security and integrity of data.
LINK Systems in organizations.
A.3.2 Explain how end-users can interact
with a database.
3 Practical activities to demonstrate
SQL, QBE, visual queries, natural
language interfaces.
A.3.3 Describe different methods of
database recovery.
2 S/E Issues relating to the cost
of implementing such systems
weighed against the importance of
the data.
A.3.4 Outline how integrated database
systems function.
3 AIM 9 An appreciation of the
continued developments in
computer systems.
A.3.5 Outline the use of databases in areas
such as stock control, police records,
health records, employee data.
2 S/E Issues relating to privacy,
security and integrity of data.
LINK Systems in organizations.
A.3.6 Suggest methods to ensure the
privacy of the personal data and
the responsibility of those holding
personal data not to sell or divulge it in
any way.
3 Students must be aware of the
implications of large database
systems.
Principles of legislation such as Data
Protection Act and Computer Misuse
Act should be addressed (it is not
necessary to study country-specific
legislation).
S/E Issues relating to privacy,
security and integrity of data.
LINK Systems in organizations.

A.3.7 Discuss the need for some databases
to be open to interrogation by other
parties (police, government, etc).
3 S/E Issues relating to privacy,
security and integrity of data.
LINK Systems in organizations.
AIM 8 An awareness of the social
impacts and ethical considerations
of holding large quantities of data.
A.3.8 Explain the difference between data
matching and data mining.
A.4 Further database models and database analysis (15 hours)
Assessment statement Obj Teacher's notes
A.4.1 Describe the characteristics of different
database models.
3 Database models should include:

relational

object-oriented

network

spatial

multi-dimensional.
Students will be expected to refer to
actual examples in their descriptions.
A.4.2 Evaluate the use of object-oriented
databases as opposed to relational
databases.
3 This may include references to
data definition, manipulation and
integrity.
A.4.3 Define the term data warehouse. 1 Subject oriented, integrated,
time-variant and non-volatile collection of
data used in decision-making.
A.4.4 Describe a range of situations suitable
for data warehousing.
2 For example, strategic planning,
business modelling.
A.4.5 Explain why data warehousing is time
dependent.
3 Data in a warehouse is only valid for
a period of time.
A.4.6 Describe how data in a warehouse is
updated in real time.
2 Data is refreshed from data in
operational systems.
A.4.7 Describe the advantages of using data
warehousing.
2 A single manageable structure to
support decision-making. Allows
complex queries to be run across a
number of business areas.

Computer science guide


A.4.8 Explain the need for ETL processes in
data warehousing.
3 Students should understand that
processes are necessary to Extract
data from disparate sources,
Transform the data into a uniform
format for specialized processing
and Load the extracted data into the
data warehouse.
A.4.9 Describe how ETL processes can
be used to clean up data for a data
warehouse.
2 Examples should be used to show
how disparate data can be changed
to a uniform format in order to be
suitable for analysis.
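
The Extract, Transform and Load steps described in A.4.8 and A.4.9 might look like the sketch below. The two sources, their field formats and the target layout are all assumptions invented for the example; a real warehouse would load into a database rather than a list.

```python
from datetime import datetime
from decimal import Decimal

# Extract: two hypothetical sources that record the same facts differently.
shop_rows = [{"customer": " Ana ", "date": "03/02/2024", "total": "12.50"}]
web_rows = [{"customer": "BEN", "date": "2024-02-04", "total": "7"}]

DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")


def parse_date(text):
    """Try each known source format and return a date object."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unrecognised date: {text!r}")


def transform(row):
    """Clean one source row and convert it to the uniform warehouse format."""
    return {
        "customer": row["customer"].strip().title(),
        "date": parse_date(row["date"]).isoformat(),
        "total_cents": int(Decimal(row["total"]) * 100),
    }


warehouse = []  # stands in for the data warehouse


def load(rows):
    """Append the transformed rows to the warehouse."""
    warehouse.extend(rows)


load(transform(row) for row in shop_rows + web_rows)
```

After loading, both records share one name style, one date format and one unit for totals, which is what makes analysis across the sources possible.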
A.4.10 Compare the different forms of
discovering patterns using data
mining.
3 Students are expected to be able to
describe the conceptual approach
used by:

cluster analysis

associations

classifications

sequential patterns

forecasting.
The student does not need
to understand the detailed
implementation of these methods.
AIM 8 An awareness of the social
impacts and ethical considerations
when data mining.
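
To make one of the approaches in A.4.10 concrete, the sketch below shows the idea behind associations: counting how often items occur together. It is a conceptual illustration only, in line with the note that detailed implementation is not required; the baskets and the support function are assumptions for the example.

```python
from collections import Counter
from itertools import combinations

# Each set is the contents of one hypothetical shopping basket.
baskets = [
    {"bread", "butter", "jam"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"butter", "jam"},
]

# Count every pair of items that appears together in a basket.
pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1


def support(item_a, item_b):
    """Fraction of baskets that contain both items."""
    pair = tuple(sorted((item_a, item_b)))
    return pair_counts[pair] / len(baskets)
```

A pair with high support, such as bread and butter here, is a candidate association that a retailer might investigate further.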
A.4.11 Describe situations that benefit from
data mining.
2 Examples can be cited such as the
use of mining techniques by banks
to identify fraudulent credit card use;
retailers can use mining techniques
to identify subsets of the population
likely to respond to a particular
promotion.
A.4.12 Describe how predictive modelling is
used.
2 The use of classification techniques
such as "decision tree induction"
or "backpropagation in neural
networks". The determination of
values for rows of a database useful
for predictions.
A.4.13 Explain the nature of database
segmentation.
3 The partitioning of a database
according to some feature in
common in the rows.
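
Segmentation as described in A.4.13 amounts to grouping rows by a shared feature. A minimal sketch, assuming a hypothetical list of customer rows and a region field:

```python
from collections import defaultdict

customers = [
    {"name": "Ana", "region": "north"},
    {"name": "Ben", "region": "south"},
    {"name": "Chi", "region": "north"},
]


def segment(rows, feature):
    """Partition rows into groups that share the same value for `feature`."""
    segments = defaultdict(list)
    for row in rows:
        segments[row[feature]].append(row)
    return dict(segments)


by_region = segment(customers, "region")
north_names = [row["name"] for row in by_region["north"]]
```

Every row lands in exactly one segment, so the segments together still make up the whole data set.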
A.4.14 Explain the nature and purpose of link
analysis.
3 The use of rules to establish
associations between individual
records in a data set.

A.4.15 Describe the process of deviation
detection.
2 Outlying data can be detected
using statistical techniques
in order to identify unusual events or
data subsets.
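
One simple statistical technique for deviation detection is to flag values that lie far from the mean. The sketch below assumes a threshold of two standard deviations and a hypothetical list of daily spending figures; other techniques and thresholds are equally valid.

```python
from statistics import mean, stdev


def outliers(values, threshold=2.0):
    """Return the values more than `threshold` standard deviations from the mean."""
    centre = mean(values)
    spread = stdev(values)
    return [v for v in values if abs(v - centre) > threshold * spread]


daily_spend = [20, 22, 19, 21, 23, 20, 250]
unusual = outliers(daily_spend)
```

Here only the value 250 is flagged, which is the kind of unusual event a bank might review as possible card fraud.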