Object Oriented Programming (OOP) is something that you have to learn by doing. You can read all you want about how a Car can be a subclass of a Vehicle, and buzzwords like “polymorphism”, but until you rewire your brain to think in a different way you won’t fully understand it.
In this post, I’ll walk you through a real problem, strip out the jargon, and show you how you can understand OOP by changing the way you think about your program.
Content Overview
- Prerequisites
- Problem Scenario
- What NOT to do
- What you SHOULD do
- Conclusion
Prerequisites
This isn’t for complete newcomers to OOP, although you may still find it useful. Perhaps you’ve never really got OOP, and you struggle to design object-oriented programs yourself.
You should have a basic working knowledge of OOP - for example, what a constructor is, and the meaning of public and private.
Problem Scenario
We’ll work through a coding test I once had to do for a job application. You are given a text file that represents a maze:
XXXXXXXXXXXXXXX
X             X
X XXXXXXXXXXX X
X XS        X X
X XXXXXXXXX X X
X XXXXXXXXX X X
X XXXX      X X
X XXXX XXXX X X
X XXXX XXXX X X
X X    XXXXXX X
X X XXXXXXXXX X
X X XXXXXXXXX X
X X         X X
X XXXXXXXXX   X
XEXXXXXXXXXXXXX
`X` represents a wall, `S` is the ‘start’, and `E` is the ‘end’. Spaces are where your ‘explorer’ can travel through the maze.
The task is to write a program that fulfills the following criteria:
- Given a maze, the explorer should be able to drop into the Start location (facing north)
- An explorer in a maze must be able to move forward, turn left, and turn right
- An explorer must be able to declare what is in front of them
- An explorer must be able to declare all ‘movement options’ given their location
- An explorer must be able to declare how many steps they have taken so far
- An explorer must be able to report a record of where they have been in an understandable fashion
What NOT to do
In my early career I would have made the following mistakes when thinking about this problem:
- I would have thought of my program as a sequence of instructions for the computer to carry out
- I would have thought about how I would model the data in the computer’s memory
- I would have imagined a user running the program and interacting with it
These are a bit abstract, so let me explain them.
I would have thought of my program as a sequence of instructions for the computer to carry out
I first learned to program in C, and so my mental model of programming was writing a sequence of step-by-step instructions for the computer to execute. You may have started learning with Python or JavaScript, in which case your mental model will be the same. Unless the very first language you ever used was Java or C#, you would have thought of your programs as a sequence of instructions. Data comes into the program, each line in the program performs operations on that data, and an output is produced. This is called imperative programming.
I would probably have thought: “What is the very first thing that the program needs to do? It needs to read the text file.” And I would have written code - in a main function - that read the file.
I’ll use Java as an example, but all this applies to any language, and there’s nothing specific to Java here.
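import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
public class Main {
    public static void main(String[] args) throws IOException {
        String maze = new String(Files.readAllBytes(Paths.get("maze.txt")));
    }
}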
I would have thought about how I would model the data in the computer’s memory
Then I would have considered the data, and how it would be stored. I’d perhaps store the data in a 2D array, which makes sense given the maze is a 2D grid of characters.
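import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
public class Main {
    public static void main(String[] args) throws IOException {
        // Read the maze file into a 2D array of characters, one row per line
        List<String> lines = Files.readAllLines(Paths.get("maze.txt"));
        char[][] maze = new char[lines.size()][];
        for (int i = 0; i < lines.size(); i++) {
            maze[i] = lines.get(i).toCharArray();
        }
    }
}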
I would have imagined a user running the program and interacting with it
I would have looked at the specification and considered how the user would interact with the program. So the first one: Given a maze, the explorer should be able to drop into the Start location (facing north). I probably would have done something like this:
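import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
public class Main {
    public static void main(String[] args) throws IOException {
        List<String> lines = Files.readAllLines(Paths.get("maze.txt"));
        char[][] maze = new char[lines.size()][];
        for (int i = 0; i < lines.size(); i++) {
            maze[i] = lines.get(i).toCharArray();
        }
        int[] startCoordinates = getStartCoordinates(maze);
        int[] explorerCoordinates = startCoordinates;
        char explorerDirection = 'N';
    }
    private static int[] getStartCoordinates(char[][] maze) {
        // search grid and return coordinates of 'S' as [x, y]
        for (int y = 0; y < maze.length; y++) {
            for (int x = 0; x < maze[y].length; x++) {
                if (maze[y][x] == 'S') {
                    return new int[] {x, y};
                }
            }
        }
        throw new IllegalStateException("Maze has no start location");
    }
}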
Then maybe I would have given the user a way to interact with the program:
public class Main {
    public static void main(String[] args) throws IOException {
        ...
        int[] startCoordinates = getStartCoordinates(maze);
        int[] explorerCoordinates = startCoordinates;
        char explorerDirection = 'N';
        String userInput = getUserInput();

        if (userInput.equals("turn left")) {
            explorerDirection = 'W';
            ...
        }
    }
    ...
}
...and so on. At some point I’d have needed some kind of input loop that kept the state updated according to the user input, but you get the idea.
The point is: I would have primarily had in my mind the idea that this is an actual program that could be used by a real user. I would have modeled the data at a low level. I would have tackled the problem step by step, from when the program is first run.
Perhaps you’re now thinking: but programs are made for real users, you should think of how data is represented in memory and a program is a sequence of instructions!
If you’re thinking this then you are of course correct, and there is hope for you yet.
You would be amazed at how many people don’t think of programs this way.
But: our job for today is to understand OOP and not get fired. Thinking of programs like this will prevent you from properly understanding OOP. You need to rewire your brain and think about your programs a little differently, if only temporarily.
What you SHOULD do
The first thing is to move away from the computer and get a pen and some paper. What are the ‘things’ in this problem? Write them down.
Maze
Explorer
Great. These are our ‘objects’.
class Maze {
}
class Explorer {
}
Now let’s consider the two things that objects have: state and methods.
State
What ‘state’ can a maze be in? It’s the state defined by the text file. It could be this:
XXXXXXXXXXXXXXX
X             X
X XXXXXXXXXXX X
X XS        X X
X XXXXXXXXX X X
X XXXXXXXXX X X
X XXXX      X X
X XXXX XXXX X X
X XXXX XXXX X X
X X    XXXXXX X
X X XXXXXXXXX X
X X XXXXXXXXX X
X X         X X
X XXXXXXXXX   X
XEXXXXXXXXXXXXX
Or it could be this:
XXXXXXXXXXXXXXX
X             X
X XXX     SXX X
X X         X X
X XX   XXXX X X
X XX   XXXX X X
X XXXX      X X
X XXXX XXXX X X
X XXXX XXXX X X
X X    XXXXXX X
X X XXXXXXXXX X
X X XXXXXXXXX X
X X         X X
X XXXXXXXXX   X
XXXXXXXXXXXXXXX
So how can we represent that in our `Maze` object? How about a 2D array of characters:
class Maze {
    char[][] grid;
}
That’s fine: the grid literally is a 2D array of characters. But we can go further and pick out some more objects from our problem. What about the allowed characters in the maze and their meaning?
We can define a `MazeElement`, like this:
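enum MazeElement {
    WALL('X'),
    SPACE(' '),
    START('S'),
    EXIT('E');
    // The character that represents this element in the maze file
    private final char symbol;
    MazeElement(char symbol) {
        this.symbol = symbol;
    }
}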
And now our `Maze` is this:
class Maze {
    MazeElement[][] grid;
}
This has some benefits. For example, instead of writing something like this:
if (grid[i][j] == 'X')
..we can instead write this:
if (grid[i][j] == MazeElement.WALL)
Now we’re working explicitly in the problem domain. Instead of the reader perhaps not knowing what `X` was, or having to check for themselves which characters are allowed, now we’re explicitly telling them.
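To get from the characters in the text file to `MazeElement` values, the enum needs some way to look up a constant from a character. Here is a minimal sketch of one way to do that. The `symbol` field and the `getSymbol` and `fromChar` methods are my own hypothetical names for illustration, not part of the original test or design.

```java
enum MazeElement {
    WALL('X'),
    SPACE(' '),
    START('S'),
    EXIT('E');

    // Character used for this element in the maze text file.
    private final char symbol;

    MazeElement(char symbol) {
        this.symbol = symbol;
    }

    char getSymbol() {
        return symbol;
    }

    // Hypothetical helper: find the element for a character read from the file.
    static MazeElement fromChar(char c) {
        for (MazeElement element : values()) {
            if (element.symbol == c) {
                return element;
            }
        }
        // Assumption: any character outside the four known ones is an error.
        throw new IllegalArgumentException("Unknown maze character: " + c);
    }
}
```

With something like this in place, the knowledge of which characters are allowed lives in exactly one place, and everything else in the program can talk in terms of walls and spaces.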
That’s all we need to represent the state of the maze at any time. Now what about the Explorer? What ‘state’ can the Explorer have?
Well, the Explorer has a position in the Maze, which we can represent with grid coordinates. What’s the best ‘type’ to represent that? The primitive approach would be to use an array of two integers, but we can do better than that. Many languages have their own classes to represent coordinates, as it’s such a common problem. In Java, we can use the Point class, from `java.awt`.
class Explorer {
    Point currentLocation;
}
The Maze itself is also part of the Explorer’s state. Perhaps there are multiple mazes and the Explorer could be in any one of them.
class Explorer {
    Maze currentMaze;
    Point currentLocation;
}
We know from the problem spec that the Explorer can face in different directions:
class Explorer {
    Maze currentMaze;
    Point currentLocation;
    String directionFacing;
}
where the direction can be north, south, east or west. But like the MazeElements we can go one better, and define another object:
enum Direction {
    NORTH,
    SOUTH,
    EAST,
    WEST;
}
then the Explorer becomes:
class Explorer {
    Maze currentMaze;
    Point currentLocation;
    Direction directionFacing;
}
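A side benefit of making `Direction` its own type is that knowledge about directions can live on it, rather than being scattered around as hard-coded characters. As a rough sketch, and looking ahead to the implementation stage, the enum could know what a quarter turn means. The method names `turnedLeft` and `turnedRight` are hypothetical, chosen just for this example.

```java
enum Direction {
    NORTH,
    SOUTH,
    EAST,
    WEST;

    // Hypothetical helper: the direction faced after a 90-degree turn to the left.
    Direction turnedLeft() {
        switch (this) {
            case NORTH: return WEST;
            case WEST:  return SOUTH;
            case SOUTH: return EAST;
            default:    return NORTH; // EAST
        }
    }

    // Hypothetical helper: the direction faced after a 90-degree turn to the right.
    Direction turnedRight() {
        switch (this) {
            case NORTH: return EAST;
            case EAST:  return SOUTH;
            case SOUTH: return WEST;
            default:    return NORTH; // WEST
        }
    }
}
```

An Explorer built on this would only need to write something like `directionFacing = directionFacing.turnedLeft();` when it turns, whichever way it happens to be facing.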
Again looking at the problem spec the Explorer has a ‘history’: the path they have taken and the number of steps. So we can add:
class Explorer {
    Maze currentMaze;
    Point currentLocation;
    Direction directionFacing;
    int stepCount;
    List<Direction> path;
}
Actions
We’ve defined all our objects, and their state:
Maze
    grid
    
Explorer
    currentMaze
    currentLocation
    directionFacing
    stepCount
    path
MazeElement
    WALL, SPACE, START, EXIT
Direction
    NORTH, SOUTH, EAST, WEST
As well as having state, objects do things. Or, more accurately, they allow you to do things to them.
So let’s go through the problem spec to see what we need them to let us do.
Given a maze the explorer should be able to drop in to the Start location (facing north)
class Explorer {
    ...
    void enterMaze(Maze maze) {
        // TODO
    }
}
We’re not going to implement the method yet as we’re still designing our objects.
Remember to sketch out your objects up front, before you start writing any real code.
An explorer in a maze must be able to move forward, turn left or turn right
Now we’re seeing how easy this is:
class Explorer {
    ...
    void enterMaze(Maze maze) {
        // TODO
    }
    void moveForward() {
        // TODO
    }
    void turnLeft() {
        // TODO
    }
   
    void turnRight() {
        // TODO
    }
}
Note that each of these methods will change the state of the Explorer. enterMaze will set currentMaze, directionFacing and currentLocation. moveForward will increment stepCount, add to the path and change the currentLocation.
An explorer must be able to declare what is in front of them
class Explorer {
    
    ...
    MazeElement declareWhatIsInFront() {
        // TODO
    }    
}
An explorer must be able to declare all movement options from their given location
We might have questions about what exactly this means, but it doesn’t matter for now.
class Explorer {
    
    ...
    Set<Direction> declareMovementOptions() {
        // TODO
    }    
}
An explorer must be able to declare the number of times they have moved forward so far
class Explorer {
    
    ...
    int declareNumberOfSteps() {
        // TODO
    }    
}
An explorer must be able to report a record of where they have been in an understandable fashion
class Explorer {
    
    ...
    List<Direction> declarePath() {
        // TODO
    }    
}
Notice our method names have “declare” in them a lot. This matches the language of the spec, but actually it’s a little unclear what “declare” means. Does it mean “return” or could it mean “print”? To make things clearer we can change “declare” to “get”, so our object now looks like this:
class Explorer {
    Maze currentMaze;
    Point currentLocation;
    Direction directionFacing;
    int stepCount;
    List<Direction> path;
    void enterMaze(Maze maze) {}
    void moveForward() {}
    void turnLeft() {}
    void turnRight() {}
    MazeElement getMazeElementInFront() {}
    Set<Direction> getMovementOptions() {}
    int getNumberOfSteps() {}
    List<Direction> getPath() {}
}
Public vs Private
Looking at the Explorer object above, we only need to “expose” the methods to the user (the user of the class) to fulfil the spec. We don’t need the user to be able to access the state - they do that through the methods.
class Explorer {
    private Maze currentMaze;
    private Point currentLocation;
    private Direction directionFacing;
    private int stepCount;
    private List<Direction> path;
    public void enterMaze(Maze maze) {}
    public void moveForward() {}
    public void turnLeft() {}
    public void turnRight() {}
    public MazeElement getMazeElementInFront() {}
    public Set<Direction> getMovementOptions() {}
    public int getNumberOfSteps() {}
    public List<Direction> getPath() {}
}
As a rule of thumb, state is private and methods are usually public.
The public methods define the interface: how a user of the object interacts with it.
Let’s say though that when writing the public methods we find we’ve got repeated code in a few of them, and we want to extract that into a different method. Let’s say the new method is this:
Point getPointInDirectionOf(Direction d) {}
which returns the Point in a given direction. This could be used by the moveForward, getMazeElementInFront and getMovementOptions methods.
This method should be private. It’s only used internally, by the object itself. The ‘user’ of the class doesn’t need to access that method. They don’t even need to know that it exists, because it doesn’t affect the behaviour of the class at all.
This is called encapsulation. The ‘inner workings’ of the object are not exposed to the ‘user’ of the object.
class Explorer {
    private Maze currentMaze;
    private Point currentLocation;
    private Direction directionFacing;
    private int stepCount;
    private List<Direction> path;
    public void enterMaze(Maze maze) {}
    public void moveForward() {}
    public void turnLeft() {}
    public void turnRight() {}
    public MazeElement getMazeElementInFront() {}
    public Set<Direction> getMovementOptions() {}
    public int getNumberOfSteps() {}
    public List<Direction> getPath() {}
    private Point getPointInDirectionOf(Direction d) {}
}
This private method isn’t something you’ll be able to plan up front - you’ll only encounter it when you get to actually implementing the methods. I include it here simply to explain public and private methods.
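To make the public/private split concrete, here is a rough sketch of how two of the public methods might share the private helper once you get to implementation. It rests on several assumptions of mine: the `Maze` here is a cut-down stand-in built directly from a grid, `enterMaze` takes the start location as an extra parameter instead of searching for it, row 0 is the top of the file (so north means y - 1), and walking into a wall is silently ignored. None of that comes from the original spec.

```java
import java.awt.Point;

enum MazeElement { WALL, SPACE, START, EXIT }

enum Direction { NORTH, SOUTH, EAST, WEST }

// Stand-in for the real Maze: just enough to answer "what is at x, y?".
class Maze {
    private final MazeElement[][] grid;

    Maze(MazeElement[][] grid) {
        this.grid = grid;
    }

    public MazeElement getElementAtCoordinate(int x, int y) {
        return grid[y][x]; // rows are y, columns are x; no bounds checking in this sketch
    }
}

class Explorer {
    private Maze currentMaze;
    private Point currentLocation;
    private Direction directionFacing;
    private int stepCount;

    // Simplified for this sketch: the start location is passed in rather than searched for.
    public void enterMaze(Maze maze, Point start) {
        currentMaze = maze;
        currentLocation = new Point(start);
        directionFacing = Direction.NORTH;
    }

    public MazeElement getMazeElementInFront() {
        Point ahead = getPointInDirectionOf(directionFacing);
        return currentMaze.getElementAtCoordinate(ahead.x, ahead.y);
    }

    public void moveForward() {
        // Assumption: walking into a wall is silently ignored.
        if (getMazeElementInFront() != MazeElement.WALL) {
            currentLocation = getPointInDirectionOf(directionFacing);
            stepCount++;
        }
    }

    public int getNumberOfSteps() {
        return stepCount;
    }

    // Private: users of the class never see this; the public methods above share it.
    private Point getPointInDirectionOf(Direction d) {
        int x = currentLocation.x;
        int y = currentLocation.y;
        switch (d) {
            case NORTH: return new Point(x, y - 1); // row 0 is the top of the file
            case SOUTH: return new Point(x, y + 1);
            case EAST:  return new Point(x + 1, y);
            default:    return new Point(x - 1, y); // WEST
        }
    }
}
```

If the helper later changed, for example to flip the coordinate system, nothing that uses the Explorer would need to change, because none of it can see the helper.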
As you begin to write the actual code you’ll also discover things you need the Maze to do, which aren’t necessarily in the problem spec. For example, the Explorer is going to have to ‘ask’ the Maze “which element is at coordinate x, y?”. In that case the method on the Maze should be public, because that functionality is being made available to other objects, in this case to the Explorer.
Constructors
Every class has a constructor (in Java, the compiler supplies a default one if you don’t write your own). The function of the constructor is to initialise the object’s state.
In the case of the Maze, we need to set the grid, which we read from a text file.
public Maze(String fileName) {
    // TODO: iterate through characters in the file and assign them to the grid
}
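When you do come to fill in that TODO, it might look something like the sketch below. This is one possible approach rather than the definitive one. It assumes a hypothetical `fromChar` lookup on `MazeElement`, adds `throws IOException` to the constructor because reading a file can fail, and treats x as the column and y as the row.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

enum MazeElement {
    WALL('X'), SPACE(' '), START('S'), EXIT('E');

    private final char symbol;

    MazeElement(char symbol) {
        this.symbol = symbol;
    }

    // Hypothetical helper: map a character from the file to an element.
    static MazeElement fromChar(char c) {
        for (MazeElement element : values()) {
            if (element.symbol == c) {
                return element;
            }
        }
        throw new IllegalArgumentException("Unknown maze character: " + c);
    }
}

class Maze {
    private final MazeElement[][] grid;

    // Declares IOException because reading the file can fail
    // (an addition to the signature used in the design above).
    public Maze(String fileName) throws IOException {
        List<String> lines = Files.readAllLines(Paths.get(fileName));
        grid = new MazeElement[lines.size()][];
        for (int row = 0; row < lines.size(); row++) {
            String line = lines.get(row);
            grid[row] = new MazeElement[line.length()];
            for (int col = 0; col < line.length(); col++) {
                grid[row][col] = MazeElement.fromChar(line.charAt(col));
            }
        }
    }

    // Assumption: x is the column and y is the row, with (0, 0) at the top left.
    public MazeElement getElementAtCoordinate(int x, int y) {
        return grid[y][x];
    }
}
```

Notice that the file reading and the 2D array from the “what NOT to do” section are still here. The difference is that they are now hidden inside the Maze, and nothing outside it needs to know about them.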
For the Explorer, what state do we need to set?
private Maze currentMaze;
private Point currentLocation;
private Direction directionFacing;
private int stepCount;
private List<Direction> path;
We already have a separate method enterMaze which will assign currentMaze (although you could do that in the constructor instead), so currentMaze can be initialised to null.
The other fields can be initialised to null in the case of the Point and Direction, 0 for stepCount, and an empty list for path.
public Explorer() {
    currentMaze = null;
    currentLocation = null;
    directionFacing = null;
    stepCount = 0;
    path = new ArrayList<>();
}
Conclusion
Our objects look like this:
class Maze {
    private MazeElement[][] grid;
    public Maze (String fileName) {}
    public MazeElement getElementAtCoordinate(int x, int y) {}
}
class Explorer {
    private Maze currentMaze;
    private Point currentLocation;
    private Direction directionFacing;
    private int stepCount;
    private List<Direction> path;
    public Explorer() {}
    public void enterMaze(Maze maze) {}
    public void moveForward() {}
    public void turnLeft() {}
    public void turnRight() {}
    public MazeElement getMazeElementInFront() {}
    public Set<Direction> getMovementOptions() {}
    public int getNumberOfSteps() {}
    public List<Direction> getPath() {}
}
enum MazeElement {
    WALL('X'),
    SPACE(' '),
    START('S'),
    EXIT('E');
}
enum Direction {
    NORTH,
    SOUTH,
    EAST,
    WEST;
}
Note that we haven’t written any actual code yet - this is all just designing our objects up front.
Notice how different that is from our first tactic of seeing the program as a sequence of instructions executed one by one.
In fact, there’s almost nothing here that’s language specific. You can apply the principles here to almost any object-oriented language.
By thinking about your objects up front you’ve simplified the task massively. Now you can tackle each part of the problem spec in isolation. This relieves you of the burden of having to keep the whole problem in your head at once, which you’d have to do if you were writing imperative code that goes from top to bottom.
Note as well that we haven’t considered how a user would run and interact with this program. The problem spec doesn’t ask for that. When writing OOP you need to think more abstractly, and tackle small pieces of the problem in isolation. If we did want to add a UI later, that would be easy to do in a new object that interacted with our Maze and Explorer via their interfaces.
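As a sketch of what such a UI object could look like, here is a small text-command handler that only ever touches the Explorer’s public methods. Everything in it is an assumption for illustration: the `ExplorerConsole` class, its `handle` method and the command strings are hypothetical, and the `Explorer` is a stripped-down stub (its turn methods do nothing) so that the example stands on its own.

```java
import java.util.Locale;

// Stub standing in for the real Explorer: same public method names, almost no behaviour.
class Explorer {
    private int stepCount;

    public void moveForward() {
        stepCount++;
    }

    public void turnLeft() {
        // Left empty in this stub.
    }

    public void turnRight() {
        // Left empty in this stub.
    }

    public int getNumberOfSteps() {
        return stepCount;
    }
}

// Hypothetical UI object: translates text commands into calls on the Explorer.
class ExplorerConsole {
    private final Explorer explorer;

    ExplorerConsole(Explorer explorer) {
        this.explorer = explorer;
    }

    // Returns the message to show the user for a given command.
    String handle(String command) {
        switch (command.trim().toLowerCase(Locale.ROOT)) {
            case "forward":
                explorer.moveForward();
                return "Moved forward";
            case "turn left":
                explorer.turnLeft();
                return "Turned left";
            case "turn right":
                explorer.turnRight();
                return "Turned right";
            case "steps":
                return "Steps taken: " + explorer.getNumberOfSteps();
            default:
                return "Unknown command: " + command;
        }
    }
}
```

The input loop that the imperative version started with would end up here, at the very edge of the program, instead of being the thing everything else is built around.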
Next
In my next post, I’ll use this same Maze example to show you how to use Test Driven Development (TDD) to make your life even easier when implementing the methods. To make sure you don’t miss it, follow me on Twitter.
