University of Arizona, Department of Computer Science

CSc 120: Basketball Data Analysis

This problem involves writing a program to analyze historical win-loss data for a single season of Division I NCAA women’s basketball teams and compute from this the win ratio for each team as well as the conference(s) with the highest average win ratio.

Restrictions

  1. For the long problems, your code should follow the style guidelines for the class.
  2. You may not use concepts and/or short-hand syntax not yet covered in class. The restrictions include the following:

Background

Whether it's football, basketball, lacrosse, or any number of other sports, loyal fans want to support their teams and also size up the competition. An important indicator of the quality of a team is its win ratio over a season, which is defined to be the fraction of the games the team played that it won. Thus, if a team plays N games and wins M of them, then its win ratio is M/N. Assuming (as we will do for this problem) that there are no tied games, a team with W wins and L losses has a win ratio of W/(W+L). The average win ratio for a conference in a given season is the average of the win ratios of all of the teams in that conference in that season.
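
As a quick check of the definition, here is a minimal sketch that computes a win ratio from a win count and a loss count. The function name win_ratio and its parameters are illustrative only and are not part of the required interface. The assignment does not say what to do for a team with no games, so returning 0.0 in that case is an assumption.

```python
def win_ratio(wins, losses):
    """Return the fraction of games won, assuming there are no ties."""
    games = wins + losses
    if games == 0:
        # Assumption: the assignment does not cover a team with no
        # games; returning 0.0 simply avoids dividing by zero.
        return 0.0
    return wins / games


# Arizona's 2015-16 record from the sample input: 13 wins, 19 losses.
print(win_ratio(13, 19))   # 13/32 = 0.40625
```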

Expected Behavior

Write a program, in a file bball.py, that behaves as follows.

  1. Use input() (without arguments) to read the name of an input data file.
  2. Read and process the file specified (see 'Input' below for the file format). Each non-comment line in the file gives the win-loss data for one team.
  3. Compute the average win ratio for each conference and find the conference(s) that have the highest win ratio.
  4. Print out the results of your analysis in the format given below under 'Output'. Some examples are shown in Basketball Input/Output Examples.
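
One possible way to handle steps 1 and 2 is sketched below. The function names data_lines and read_data_file are hypothetical and not required by the assignment. The sketch assumes that a comment is any line whose very first character is "#"; blank lines are not mentioned in the file format, so they are not treated specially here.

```python
def data_lines(lines):
    """Return the non-comment lines, with the trailing newline removed."""
    result = []
    for line in lines:
        if line.startswith("#"):
            continue              # comment line: ignore it
        result.append(line.rstrip("\n"))
    return result


def read_data_file(filename):
    """Open filename and return its non-comment lines as a list."""
    infile = open(filename)
    lines = data_lines(infile)    # a file object yields one line at a time
    infile.close()
    return lines
```

A program could then obtain the data lines with read_data_file(input()). Keeping the filtering in its own function makes it easy to try out on a small list of strings without needing a file.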

Input

Any line in the input file that begins with "#" is a comment and should be ignored. Each non-comment line gives win-loss data for a team and has the following format: the team name (one or more words), followed by the conference name in parentheses (one or more words), followed by the number of wins, followed by the number of losses. For example:
# Division I Women's Basketball: 2015-16 Season
# School (Conference)	Wins	Losses
UConn (AAC)		       38     0
UC Riverside (Big West)	23     9
St. John's (NY) (Big East)	23     10
Arizona (Pac-12)	13	19

Note that the team name may consist of multiple words, some of which may be parenthesized, e.g., "St. John’s (NY)". The conference name is given by the rightmost parenthesized group of words in the line. Make sure to look at all of the provided input files before making decisions about how to process the lines and extract the necessary data.

You may find the method s.rfind(chars) helpful. It searches for chars in s starting from the right-hand side of the string, and returns the highest index at which chars is found, or -1 if it is not found.
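
The sketch below shows one way rfind might be used to pull a data line apart. The function name split_line is hypothetical. It assumes that every data line is well formed, that the conference name itself contains no parentheses, and that the wins and losses are the only text after the final ")". The provided input files may contain cases this sketch does not handle, so check them before relying on it.

```python
def split_line(line):
    """Split one data line into (team, conference, wins, losses)."""
    right = line.rfind(")")            # end of the rightmost parenthesized group
    left = line.rfind("(", 0, right)   # the "(" that opens that group
    team = line[:left].strip()
    conf = line[left + 1:right].strip()
    numbers = line[right + 1:].split() # split() copes with tabs and spaces
    return team, conf, int(numbers[0]), int(numbers[1])


print(split_line("St. John's (NY) (Big East)\t23     10"))
# ("St. John's (NY)", 'Big East', 23, 10)
```

Searching from the right is what keeps "(NY)" as part of the team name: only the last parenthesized group is taken as the conference.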

Output

The output should be the conference name and average win ratio for each of the conferences with the best average win ratio, one conference per line, in the following format (the simplest way to get this into your code without any mistakes is to copy-paste it into your program and then edit the variable names appropriately):
"{} : {}".format(conf_name, conf_win_ratio_avg)
where conf_name is the name of a conference and conf_win_ratio_avg is the average win ratio for that conference. If more than one conference has the highest average win ratio, print them out in alphabetical order. Some examples are shown in Basketball Input/Output Examples.
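
To illustrate the tie case, here is a minimal sketch that formats a list of tied conferences. The function name format_results, the use of (name, average) pairs, and the sample values are all made up for illustration. The sketch assumes that "alphabetical order" means Python's default string ordering, in which uppercase letters sort before lowercase ones.

```python
def format_results(best):
    """best is a list of (conf_name, conf_win_ratio_avg) pairs."""
    lines = []
    # sorted() orders the pairs by conference name first.
    for conf_name, conf_win_ratio_avg in sorted(best):
        lines.append("{} : {}".format(conf_name, conf_win_ratio_avg))
    return lines


# Hypothetical tie between two conferences (not real season data).
for line in format_results([("Pac-12", 0.75), ("AAC", 0.75)]):
    print(line)
# AAC : 0.75
# Pac-12 : 0.75
```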

Programming Requirements

Your program should implement (at least) the following classes along with the methods specified.

class Team
An object of this class represents information about a team: namely, the team name, the conference it belongs to, and win-loss data. This class should implement the following methods:
  • __init__(self, line) : line is a line read from the input file. Initializes the object with information extracted from line. The information stored as attributes for each team should be sufficient to implement the other methods for the team specified below.
  • name(self): returns the name of the team.
  • conf(self): returns the conference the team belongs to.
  • win_ratio(self): returns the win ratio for the team.
  • __str__(self): returns a string with information about the team, in the following format:
    "{} : {}".format(name, win_ratio_str)
    where name is the name of the team and win_ratio_str is its win ratio (as a string).

class Conference
An object of this class represents information about a collection of teams, namely, the teams belonging to that conference. This class should implement the following methods:
  • __init__(self, conf) : conf is a string giving the name of a conference. Initializes a conference object with name conf. The list of teams for the conference is initialized to be empty.
  • __contains__(self, team) : team is a Team object. Returns True if team is in the list of teams for this conference; False otherwise.
  • name(self): returns the name of the conference.
  • add(self, team): team is a Team object. Adds team to the list of teams associated with this conference.
  • win_ratio_avg(self): returns the average win ratio for the conference (a floating-point value).
  • __str__(self): returns a string with information about the conference, in the following format:
    "{} : {}".format(name, win_ratio_str)
    where name is the name of the conference and win_ratio_str is its average win ratio (as a string).
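
The following is a rough sketch of how __contains__ and win_ratio_avg might fit together, not a complete or definitive Conference class. StubTeam is a hypothetical stand-in for Team, and the attribute names _name and _teams are assumptions. The sketch also assumes that win_ratio_avg is only called after at least one team has been added.

```python
class StubTeam:
    """Hypothetical stand-in for Team, just enough for this sketch."""
    def __init__(self, name, ratio):
        self._name = name
        self._ratio = ratio

    def win_ratio(self):
        return self._ratio


class Conference:
    def __init__(self, conf):
        self._name = conf
        self._teams = []

    def __contains__(self, team):
        # Python calls this method to evaluate "team in conference".
        return team in self._teams

    def add(self, team):
        self._teams.append(team)

    def win_ratio_avg(self):
        total = 0.0
        for team in self._teams:
            total += team.win_ratio()
        # Assumption: the list of teams is not empty.
        return total / len(self._teams)


conf = Conference("Pac-12")
a = StubTeam("A", 0.5)          # made-up teams and ratios
b = StubTeam("B", 1.0)
conf.add(a)
conf.add(b)
print(a in conf)                # True
print(conf.win_ratio_avg())     # (0.5 + 1.0) / 2 = 0.75
```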

class ConferenceSet
An object of this class represents a collection of conferences. (You may use a list for the collection of conferences, or any other data structure of your choice.) This class should implement the following methods:
  • __init__(self) : Initializes the collection of conferences to be empty.
  • add(self, team) : team is a Team object. Adds team to the appropriate conference in the collection of conferences, if necessary creating a Conference object for the conference of this team.
  • best(self) : returns a list of conferences that have the highest average win ratio.
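
Because best returns a list, it has to keep every conference that ties for the maximum, not just the first one found. The sketch below shows that idea on plain (name, average) pairs rather than on Conference objects; the function name best_pairs and the sample values are hypothetical. It compares floating-point averages with ==, which assumes that tied averages are computed in a way that gives exactly equal values.

```python
def best_pairs(pairs):
    """Return every (name, avg) pair whose avg equals the highest avg."""
    best = []
    for name, avg in pairs:
        if best == [] or avg > best[0][1]:
            best = [(name, avg)]       # new maximum: start the list over
        elif avg == best[0][1]:
            best.append((name, avg))   # ties the current maximum: keep it too
    return best


# Made-up averages, for illustration only.
print(best_pairs([("X", 0.5), ("Y", 0.75), ("Z", 0.75)]))
# [('Y', 0.75), ('Z', 0.75)]
```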

You may find it convenient to also implement __repr__() methods for some of these classes for debugging purposes. This is optional.
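
As an illustration of why __repr__ helps with debugging: when a list of objects is printed, Python shows each element using __repr__, not __str__. The class TeamRecord below is hypothetical and exists only to demonstrate this; its attributes and output format are not part of the assignment.

```python
class TeamRecord:
    """Hypothetical class used only to demonstrate __repr__."""
    def __init__(self, name, wins, losses):
        self._name = name
        self._wins = wins
        self._losses = losses

    def __repr__(self):
        return "TeamRecord({}, {}, {})".format(
            self._name, self._wins, self._losses)


# Printing a list uses __repr__ for each element.
print([TeamRecord("UConn", 38, 0), TeamRecord("Arizona", 13, 19)])
# [TeamRecord(UConn, 38, 0), TeamRecord(Arizona, 13, 19)]
```

Without __repr__, the same print would show something like <__main__.TeamRecord object at 0x...> for each element, which says nothing about the data inside.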

Errors

There is no output related to errors for this program.

Examples

Several examples of this data analysis are given in Basketball Input/Output Examples.