I started my programming journey more than 8 years ago with a book on learning programming in Java (specifically Bruce Eckel’s Thinking in Java). I chose Java because I was fortunate enough to have an older brother who already worked as a software engineer, and he explained to me in simple terms how to choose a programming language. At that time I was basically deciding between Java and C++. I eventually chose Java for the simple reason that it let me ignore memory management and focus on more conceptual OOP learning; all in all, Java had a much gentler learning curve than C++.

As my learning progressed, like any software developer I started to dive into a myriad of topics unrelated to Java itself - SQL, NoSQL, deployment, cloud, the Spring Framework, etc. In other words, I started preparing for a “real-world” job, which I thought would require broad knowledge across multiple software domains, and as any developer reading this knows, the grind never stops. You master one skill, and two new skills appear. You read a book on architecture and conclude that you need to order three more to cover your newly identified knowledge gaps.

But recently I’ve rediscovered my love for learning the basics - the plain old Java language. Java is written by much smarter devs than you and me, and we’ve all benefited greatly from the automatic memory management it ships with. In this blog post I would like to revisit the underlying memory model that Java uses, as well as the inner workings of the garbage collector. Understanding these structures helps a lot with avoiding multi-threading pitfalls, and gives you ideas on how to tune and improve large applications.

Memory model

In Java, memory is organised into two large spaces: the stack and the heap (there is also a smaller third segment of native memory called Metaspace, used to store class metadata). Please note that the sizes in the diagram are not to scale; in reality the heap is much larger than the stack.

Stack memory

The stack memory is used by Java for static memory allocation and thread execution. Each method call pushes a new frame onto the stack; that frame stores the method’s primitive values and references to objects created with the new keyword. Data on the stack is ordered in a LIFO (Last In, First Out) structure, and the stack shrinks and grows as the code executes.
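The original code sample is not shown here, so as a hypothetical stand-in, consider this small call chain and how it maps onto stack frames:

```java
public class StackDemo {
    public static void main(String[] args) { // frame 1: main() is pushed onto the stack
        int a = 5;                           // primitive 'a' lives in main's frame
        int result = square(a);              // frame 2: square() is pushed on top of main
        System.out.println(result);          // by this line, square's frame has been popped
    }

    static int square(int x) {               // parameter 'x' lives in square's frame
        int y = x * x;                       // so does the local 'y'
        return y;                            // returning pops this frame (LIFO order)
    }
}
```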

Some other features of stack memory include:

• each thread has its own stack, so data on it is inherently thread-confined
• its size is fixed per thread (configurable with -Xss); exceeding it throws a StackOverflowError
• allocation and deallocation are very fast, since they amount to moving the stack pointer
• memory is freed automatically as soon as the method returns

Heap space

This region of memory stores the actual object instances. Local variables on the stack hold references that point to these objects. Consider the following line of code:

Person person = new Person();

The new keyword allocates space on the heap for a new Person object, constructs it, and returns a reference to it. This reference is then stored in the local variable person on the stack.

The heap is a single, shared memory area for the entire JVM process. All threads access the same heap, regardless of how many are running. As such, thread safety of this space is up to the application developer. The heap is divided into three big chunks to facilitate garbage collection:

• Eden space - where new objects are allocated first
• Survivor spaces (S0 and S1) - where objects that survive a collection of Eden are copied back and forth
• Old (Tenured) generation - where long-lived objects end up after surviving several young-generation collections

Metaspace

Since Java 8, class metadata lives in Metaspace, the memory segment that replaced PermGen. Metaspace uses native memory, not the heap.

Object Reference types

Not many people know this, but Java actually supports multiple types of references, and they affect how the garbage collector treats objects. Let’s consider each one.

Strong reference

This is the default and most common reference type - the one we are all used to. In the Person example above, we hold a strong reference to an object on the heap. That object is not garbage collected while a strong reference points to it, or while it is strongly reachable through a chain of strong references. In other words, unless an object is specifically wrapped in another reference type, it is considered strongly referenced.
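A minimal sketch, assuming Person is a simple class with a name field:

```java
// 'Person' here is a hypothetical stand-in for the class used earlier in the post.
class Person {
    String name;
    Person(String name) { this.name = name; }
}

public class StrongRefDemo {
    public static void main(String[] args) {
        Person person = new Person("Alice"); // strong reference held on the stack
        // While 'person' points at the object, the GC will never reclaim it.
        System.out.println(person.name);
        person = null; // the object is now unreachable and *eligible* for collection
        // Eligibility does not mean immediate collection - the GC decides when to run.
    }
}
```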

Weak reference

A weak reference will most likely not survive the next garbage collection pass: once an object is only weakly reachable, the GC is free to clear it regardless of how much memory is available. A classic use case for weak references is lookup structures whose entries should not, by themselves, keep their keys alive. Weak references are initialised as follows:

WeakReference<Person> reference = new WeakReference<>(new Person());

There is actually a WeakHashMap in the official Java collections API that uses exactly this concept for its keys. Once a key of a WeakHashMap is garbage collected, the entire entry is removed from the map.
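A small sketch of that behaviour (whether the second size() call prints 0 depends on GC timing, which is not guaranteed):

```java
import java.util.Map;
import java.util.WeakHashMap;

public class WeakMapDemo {
    public static void main(String[] args) throws InterruptedException {
        Map<Object, String> cache = new WeakHashMap<>();
        Object key = new Object();
        cache.put(key, "cached value");
        System.out.println(cache.size());   // 1 - the key is still strongly reachable

        key = null;                         // drop the only strong reference to the key
        System.gc();                        // a hint only; collection is not guaranteed
        Thread.sleep(100);                  // give the GC a moment (not deterministic)

        // Once the key is collected, the whole entry disappears from the map.
        System.out.println(cache.size());   // usually 0, but timing is JVM-dependent
    }
}
```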

Soft reference

Soft references are useful in memory-sensitive situations. Objects referenced softly are only reclaimed when the JVM is under memory pressure. As long as there is enough free space available, the garbage collector leaves these objects untouched. Before the JVM ever throws an OutOfMemoryError, it is guaranteed to clear all soft-referenced objects. As the JavaDocs put it, “all soft references to softly-reachable objects are guaranteed to have been cleared before the virtual machine throws an OutOfMemoryError.”

Similarly to a weak reference, a soft reference is created as follows:

SoftReference<Person> reference = new SoftReference<>(new Person());

The difference is that soft references won’t be cleared unless memory is running out, while weak references are most likely cleared on the next GC pass that reaches them. So soft = less aggressive, weak = more aggressive.
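As a sketch of the memory-sensitive cache idea (SoftCache is a hypothetical class, not a JDK type):

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

// A toy memory-sensitive cache: values survive ordinary GC passes,
// but are reclaimed before the JVM would throw an OutOfMemoryError.
public class SoftCache<K, V> {
    private final Map<K, SoftReference<V>> map = new HashMap<>();

    public void put(K key, V value) {
        map.put(key, new SoftReference<>(value));
    }

    public V get(K key) {
        SoftReference<V> ref = map.get(key);
        // get() returns null once the GC has cleared the reference under pressure
        return ref == null ? null : ref.get();
    }
}
```

A real implementation would also need to evict map entries whose references have been cleared, e.g. with a ReferenceQueue; this sketch omits that for brevity.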

Phantom reference

Phantom references are used to track the lifecycle of an object after it has been finalised and is about to be reclaimed by the garbage collector. Unlike soft or weak references, phantom references cannot be dereferenced and are used together with a ReferenceQueue to perform post-mortem cleanup, typically for native or off-heap resources.

Phantom references are used only together with a reference queue, since their .get() method always returns null. They are considered preferable to finalisers.

ReferenceQueue<Person> queue = new ReferenceQueue<>();
Person p = new Person();
PhantomReference<Person> phantom = new PhantomReference<>(p, queue);
phantom.get(); // always returns null

Phantom references are used when an object owns resources outside the Java heap (e.g., native memory, file handles, GPU buffers, sockets, mmap’d files). Those resources must be freed after the JVM is sure the object is truly unreachable.

Java’s GC only knows how to free heap objects — it has no idea how to free native/off-heap memory. Phantom references provide a reliable notification point.
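A minimal sketch of this pattern, where NativeBuffer and freeNative() are hypothetical stand-ins for off-heap state:

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;

public class PhantomCleanupDemo {
    // Hypothetical holder of a native resource the GC cannot free on its own.
    static class NativeBuffer {
        void freeNative() { System.out.println("native memory freed"); }
    }

    public static void main(String[] args) throws InterruptedException {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Object resourceOwner = new Object();
        NativeBuffer buffer = new NativeBuffer();
        PhantomReference<Object> phantom = new PhantomReference<>(resourceOwner, queue);

        resourceOwner = null; // no more strong references to the owner
        System.gc();          // a hint; enqueueing is not guaranteed to be immediate

        // Once the owner is reclaimed, its phantom reference shows up on the queue,
        // and only then do we release the native resource (post-mortem cleanup).
        if (queue.remove(500) == phantom) {
            buffer.freeNative();
        }
    }
}
```

On JDK 9+, java.lang.ref.Cleaner packages this exact pattern behind a simpler API, which is what most code should use.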

Garbage Collection Process

So now that we’ve talked about where objects live and how references work, it’s time to look at the “invisible janitor” that actually frees memory for us: the Garbage Collector.

At a very high level, the GC does one thing - find objects that are no longer reachable from any running code, and reclaim their memory. The exact implementation depends on the chosen collector, but most modern HotSpot GCs are based on some variation of mark-and-sweep with generational optimisations.

Mark-and-sweep in practice

The classic mark-and-sweep algorithm works roughly like this:

  1. Stop the world

    The JVM briefly pauses all application threads. This is called a stop-the-world pause. During this time, only GC threads run, so your application doesn’t process requests, handle user input, etc.

  2. Find GC roots

    The JVM identifies a set of well-known entry points called GC roots. Examples include:

    • local variables on each thread’s stack
    • static fields of loaded classes
    • JIT/compiler-internal references
    • JNI references, etc.
  3. Mark phase

    Starting from these roots, the GC traverses the object graph: it follows all references from the roots into the heap, marks every object it visits as reachable, and from each marked object recursively follows its outgoing references, marking those objects too. After this phase the GC knows which objects are still in use (marked) and which are effectively dead (unmarked).

  4. Sweep (and optionally compact) phase

    Once marking is done, the GC scans through heap regions and reclaims the memory of unmarked objects. Many collectors also compact memory: they move surviving objects together to reduce fragmentation and update references to them. Compaction is important because, without it, the heap becomes fragmented over time - lots of small free blocks scattered everywhere - making large allocations harder.

  5. Resume application threads

    After reclaiming and possibly compacting memory, the GC resumes all application threads. Program execution continues as if nothing happened – except with more free memory.

This is the basic mental model to keep in mind:

Stop → mark reachable → free unreachable → (maybe compact) → resume.

In reality, modern GCs do this in more sophisticated ways (incrementally, concurrently, in different regions of the heap), but they all revolve around the same core idea.
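The steps above can be sketched as a toy mark-and-sweep over a hand-built object graph. This is a teaching illustration only, not how HotSpot actually implements collection:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ToyMarkSweep {
    static class Node {
        final String name;
        final List<Node> refs = new ArrayList<>(); // outgoing references
        Node(String name) { this.name = name; }
    }

    // Mark phase: depth-first traversal starting from the GC roots.
    static Set<Node> mark(List<Node> roots) {
        Set<Node> marked = new HashSet<>();
        for (Node root : roots) markFrom(root, marked);
        return marked;
    }

    static void markFrom(Node node, Set<Node> marked) {
        if (!marked.add(node)) return;            // already visited
        for (Node ref : node.refs) markFrom(ref, marked);
    }

    // Sweep phase: everything on the "heap" that was not marked is reclaimed.
    static List<Node> sweep(List<Node> heap, Set<Node> marked) {
        List<Node> survivors = new ArrayList<>();
        for (Node n : heap) if (marked.contains(n)) survivors.add(n);
        return survivors;                          // unmarked nodes are simply dropped
    }

    public static void main(String[] args) {
        Node a = new Node("a"), b = new Node("b"), c = new Node("c");
        a.refs.add(b);                             // a -> b; c is unreachable garbage
        List<Node> heap = List.of(a, b, c);
        List<Node> roots = List.of(a);

        List<Node> survivors = sweep(heap, mark(roots));
        survivors.forEach(n -> System.out.println(n.name)); // prints a and b, not c
    }
}
```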

Minor vs Major GC

Because the heap is split into young and old generations, GC also runs in two flavours:

• Minor GC - collects only the young generation; it runs frequently and is usually fast, because most young objects are already dead when it happens
• Major (Full) GC - collects the old generation (and typically the whole heap); it runs less often, but its pauses are noticeably longer

When people complain “GC pauses are killing my app”, they’re usually suffering from too many or too long old-gen collections.

You can’t really force GC

Java gives you methods like:

System.gc();
Runtime.getRuntime().gc();

But these are polite suggestions, not commands. The JVM is free to run a GC immediately, defer it, or even ignore your request outright (and some JVMs are configured to do exactly that in production). The reasoning is simple: the JVM has far more information than your application about heap state, allocation rates, and GC pressure. Manually sprinkling System.gc() calls almost always makes things worse, not better.

So as a rule of thumb: you don’t control when GC runs; you influence it indirectly through how many objects you create, how long they stay reachable, which data structures you use, and how you configure and tune the collector.

GC is not free: stop-the-world and performance

Even with all the optimisations in modern JVMs, GC has a cost:

• stop-the-world pauses, during which application threads make no progress
• CPU time spent marking, sweeping, and compacting instead of running your code
• extra memory headroom the collector needs to work efficiently
• latency spikes when the old generation finally gets collected

Modern GC algorithms in Java

Modern Java (especially JDK 11+ and 17+) ships with several different collectors, each with its own trade-offs. Very roughly:

• Serial GC (-XX:+UseSerialGC) - single-threaded; suited to small heaps and single-CPU machines
• Parallel GC (-XX:+UseParallelGC) - throughput-oriented, multi-threaded stop-the-world collections
• G1 (-XX:+UseG1GC) - the default since JDK 9; region-based, balancing throughput against pause times
• ZGC (-XX:+UseZGC) and Shenandoah (-XX:+UseShenandoahGC) - concurrent, low-latency collectors with very short pauses even on large heaps

Historically there were also CMS (Concurrent Mark Sweep) and PermGen, but both are essentially legacy at this point (CMS is deprecated and removed, PermGen replaced by Metaspace).

When you’re tuning a modern JVM, the question often boils down to: what’s more important for my workload - maximum throughput or predictable low pauses? You then choose a collector accordingly (e.g., Parallel vs G1 vs ZGC/Shenandoah) and tweak heap sizes and pause targets.

Configuring JVM Memory and Garbage Collector

Knowing how the JVM manages memory is great, but at some point you’ll want to take control: set heap sizes, choose a GC, and tune it for your specific workload. The good news is that most of this is done via JVM flags, so you can experiment easily.

1. Core memory settings: heap and stack

The two most important knobs are:

• -Xms - the initial heap size
• -Xmx - the maximum heap size

Example:

java -Xms1g -Xmx1g -jar my-app.jar

This starts the app with a fixed 1 GB heap (initial = max). Using the same value for -Xms and -Xmx is common for server apps because it avoids heap resizing during runtime.

You can use m or g suffixes:

java -Xms512m -Xmx2g -jar my-app.jar

This starts with 512 MB heap and allows it to grow up to 2 GB.


The stack size per thread is controlled by the -Xss flag:

java -Xss512k -jar my-app.jar

Be careful: smaller stacks allow more threads but also make stack overflows easier to hit in recursive code.
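A quick way to see -Xss in action is a sketch that counts recursion depth until the stack overflows (the exact number depends on the JVM and the size of each frame):

```java
public class StackDepthDemo {
    static int depth = 0;

    static void recurse() {
        depth++;
        recurse(); // each call pushes another frame until the thread's stack is exhausted
    }

    public static void main(String[] args) {
        try {
            recurse();
        } catch (StackOverflowError e) {
            // StackOverflowError is normally fatal; catching it is fine for a demo
            System.out.println("Overflowed after " + depth + " frames");
        }
    }
}
```

Running this with `java -Xss512k StackDepthDemo` and again with `-Xss2m` should show noticeably different depths.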

2. Selecting a garbage collector

Modern JVMs pick a reasonable default GC (e.g., G1), but you can explicitly choose one if your workload has specific requirements.

Common options:

• -XX:+UseSerialGC - a simple, single-threaded collector for small heaps
• -XX:+UseParallelGC - a throughput-oriented, multi-threaded stop-the-world collector
• -XX:+UseG1GC - the balanced, region-based default on modern JDKs

Low-latency collectors (JDK 11+ / 17+):

• -XX:+UseZGC - a concurrent collector with very short pauses that scales to large heaps
• -XX:+UseShenandoahGC - concurrent compaction and low pause times (availability depends on the JDK build)

3. Basic GC tuning knobs

Once a GC is chosen, you can give it some hints. You usually don’t want to micro-tune everything, but a few options are very commonly used.

Target GC pause times (G1 / ZGC / Shenandoah)

For G1, you can specify a desired maximum pause time:

-XX:+UseG1GC
-XX:MaxGCPauseMillis=200

This says “please try to keep GC pauses under ~200 ms”. It’s not a hard guarantee, but G1 will aim for it by adjusting region sizes, concurrent work, etc.

For very low latency with ZGC, you typically just configure heap size and let it do its thing:

-XX:+UseZGC 
-Xms4g 
-Xmx4g

Young/old generation balance (simplified)

Older flags like -XX:NewRatio or explicit young-gen sizes still exist, but for G1 and newer collectors you often let the GC decide. If you really want to tweak:

-XX:+UseParallelGC
-XX:NewRatio=3  # Young gen ~1/4 of heap, old gen ~3/4

For G1, a more relevant knob is when it starts collecting the old generation:

-XX:+UseG1GC
-XX:InitiatingHeapOccupancyPercent=45

This means G1 starts concurrent marking when ~45% of the heap is occupied, rather than waiting for it to fill up more.

4. Enabling GC logging

You can’t tune what you can’t see. GC logs are essential to understanding what your GC is actually doing in production.

-Xlog:gc*:file=gc.log:time,uptime,level,tags

This prints detailed GC info to gc.log. You can later analyse this with tools like GCViewer or gceasy (or even simple grep/awk/Excel if you’re brave).

If you’re on an older JDK, you might still see:

-XX:+PrintGCDetails
-XX:+PrintGCDateStamps
-Xloggc:gc.log

You don’t have to memorise every single flag, but knowing the “big four”:

• -Xms / -Xmx for heap sizing
• -Xss for thread stack size
• the collector selection flag (e.g. -XX:+UseG1GC, -XX:+UseZGC)
• -Xlog:gc* for GC logging

already puts you ahead of many Java developers. Combined with an understanding of the memory model and reference types, it gives you a solid foundation for diagnosing memory leaks, GC storms, and “mysterious” latency spikes in real-world systems.

Conclusion

Knowing how memory is organised gives you a very real advantage when it comes to writing correct, efficient, and scalable software. Once you understand how the JVM allocates memory, how objects move through generations, and how the GC reclaims space, you can start making deliberate choices instead of accidental ones. This knowledge also opens the door to tuning the JVM itself. By selecting the right garbage collector and configuring heap sizes, pause targets, and GC behaviour, you can adapt the runtime to the specific needs of your application - whether that’s low latency, high throughput, or simply better stability under load. With the right tools (profilers, GC logs, JFR, etc.), fixing problems like memory leaks and GC storms becomes not just possible but straightforward.