Computer Methods of Classification.

Given by Susan Laflin.

This course was offered as a 10-credit optional module for M.Sc. students and was last given in 1998. The textbook for the course and some examples of the assessment are included here. A few printed copies of the booklet are available (free) from the library.

Objective: To introduce the concepts of computer-based classification and give the students some experience in this area.

Description: Introduction to the concepts and terminology of classification. Data matrices and the calculation of similarity or distance coefficients and matrices. The idea of cluster analysis. SAHN methods and their validity. Examples of agglomerative and divisive methods. Definition of "density" in attribute space and associated concepts. Density methods and mode-analysis. Seriation methods - principal component analysis, multi-dimensional scaling and associated concepts. Comparison of shapes, methods of recording profiles. Tangent, chain-code and B-spline methods. Open and closed profiles and orientation problems.

Delivery: 18 hours of lectures and practical sessions.

Assessment: Written report on a case study.

Key Texts: Handbook for the course. (The handbook is reproduced below.)

The online material for the course comprised two folders for the students. These are no longer available.

The folder "Program" contained the executable code (PROGB) and associated files. When run, it displayed each set of data in turn and also wrote the coordinates to a file. Typing in a digit moved to the next data set until all seven sets had been generated. The coordinates were stored in the files "SET1" to "SET7" and were used for the exercises.
The software (written in Pascal to run in the student laboratory then in use) no longer works, but one set of data is available on request.

The folder "Lectures" contained the PowerPoint slides for each of the seven formal lectures in the course. The remainder of the hours were made up of tutorial periods and time spent in the computer laboratory writing software. Each lecture covered the same material as the corresponding chapter of the handbook. Consequently the slides duplicate the chapters given below and are not included.

Chapter 1. Basic Concepts.
Chapter 2. Similarity and Distance.
Chapter 3. Cluster Analysis.
Chapter 4. Sequential Methods.
Chapter 5. Seriation Methods.
Chapter 6. Density Methods.
Chapter 7. Comparing Shapes.
Examples of the case studies used in various years.

Copyright (c) Susan Laflin. 1998.