Recap

Hey guys! If you’re new here, I am running a six-month experiment to see if a Large Language Model (like ChatGPT) can be a skilled micro-cap portfolio manager. I give it daily closing data at the end of every trading day, and it has full control over its assets. Also, once a week it gets to use Deep Research to completely reevaluate its account. Can ChatGPT carve out consistent alpha in the dangerous world of micro-cap stocks? Let’s find out.

Quick Note

As mentioned in the subtitle, this post was actually meant for last week, but I forgot to schedule the upload. The real post for this week will be up either today or in the next few days, only on my profile (I don’t want to spam you all). Sorry for the inconvenience!

Overview

The week started with deep equity losses on Monday and Tuesday, and the portfolio was briefly down almost 10% from last week. However, some of those losses were recovered over the following three days.

Performance Graph

Metrics

[ Risk & Return ]

Max Drawdown: -50.33% on 2025-11-06

Sharpe Ratio (period): -0.4439

Sharpe Ratio (annualized): -0.2888

Sortino Ratio (period): -0.5166

Sortino Ratio (annualized): -0.3360
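For anyone curious how numbers like these are typically produced, here is a minimal sketch of max drawdown, Sharpe, and Sortino computed from a daily equity curve. The function names and the sample equity values are made up for illustration. The 252-day annualization and zero risk-free rate are assumptions too, and not necessarily the conventions used by the script behind this post.

```python
import math

def max_drawdown(equity):
    """Largest peak-to-trough decline, returned as a negative fraction."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = min(worst, value / peak - 1.0)
    return worst

def daily_returns(equity):
    """Simple day-over-day returns from an equity curve."""
    return [b / a - 1.0 for a, b in zip(equity, equity[1:])]

def sharpe(returns, rf_daily=0.0, periods_per_year=252):
    """Annualized Sharpe: mean excess return over sample std dev (assumed convention)."""
    excess = [r - rf_daily for r in returns]
    mean = sum(excess) / len(excess)
    var = sum((x - mean) ** 2 for x in excess) / (len(excess) - 1)
    return mean / math.sqrt(var) * math.sqrt(periods_per_year)

def sortino(returns, rf_daily=0.0, periods_per_year=252):
    """Annualized Sortino: like Sharpe, but only downside moves count as risk."""
    excess = [r - rf_daily for r in returns]
    mean = sum(excess) / len(excess)
    downside = math.sqrt(sum(min(x, 0.0) ** 2 for x in excess) / len(excess))
    return mean / downside * math.sqrt(periods_per_year)

# Hypothetical equity curve, NOT the experiment's real data.
sample_equity = [100.0, 110.0, 88.0, 99.0, 82.5]
sample_returns = daily_returns(sample_equity)

sample_mdd = max_drawdown(sample_equity)   # peak 110 -> trough 82.5
sample_sharpe = sharpe(sample_returns)
sample_sortino = sortino(sample_returns)
```

A negative Sharpe or Sortino, like the ones reported above, just means the average excess return over the window was negative. The size of a negative ratio is harder to interpret than a positive one.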

[ CAPM vs Benchmarks ]

Beta (daily) vs ^GSPC: 0.9202

Alpha (annualized) vs ^GSPC: -34.84%

R² (fit quality): 0.017

Obs: 110

Note: Short sample and/or low R² — alpha/beta may be unstable.
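To make the CAPM block a bit less mysterious, here is a rough sketch of how beta, alpha, and R² can be estimated with an ordinary least-squares fit of daily portfolio returns against daily benchmark returns. The function name `capm_fit` and the sample returns are hypothetical. Multiplying daily alpha by 252 is one common annualization convention (compounding is another), so treat it as an assumption rather than the exact method used here.

```python
def capm_fit(portfolio_returns, market_returns, rf_daily=0.0, periods_per_year=252):
    """OLS fit of excess portfolio returns on excess market returns.

    Returns (beta, annualized_alpha, r_squared).
    """
    y = [r - rf_daily for r in portfolio_returns]
    x = [r - rf_daily for r in market_returns]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))

    beta = sxy / sxx                      # slope of the regression line
    alpha_daily = mean_y - beta * mean_x  # intercept of the regression line
    r_squared = (sxy * sxy) / (sxx * syy) # share of variance explained by the market
    alpha_annual = alpha_daily * periods_per_year  # simple (non-compounded) scaling
    return beta, alpha_annual, r_squared

# Hypothetical returns, NOT the experiment's data: the portfolio moves exactly
# twice as much as the market plus 0.1% per day.
market = [0.01, -0.02, 0.015, 0.0, -0.005]
portfolio = [2.0 * m + 0.001 for m in market]

beta, alpha_annual, r_squared = capm_fit(portfolio, market)
```

In the toy data the fit is perfect, so R² comes out at 1. An R² of 0.017, as reported above, means the S&P 500 explains under 2% of the portfolio's daily variance, which is why the beta and alpha estimates should be read loosely.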

[ Snapshot ]

Latest ChatGPT Equity: $ 82.59

$100.0 in S&P 500 (same window): $ 110.72

Cash Balance: $ 22.45

Current Portfolio

ticker  shares  buy_price  cost_basis  stop_loss  PnL
MIST    14.0    1.75       24.50       1.6        13.02
SLS     13.0    1.41       18.33       1.1        4.29
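As a quick sanity check, the table and the snapshot numbers can be reconciled with a few lines of arithmetic. This sketch assumes the PnL column is unrealized dollar profit, so each position's market value is cost basis plus PnL. The helper names are made up for illustration.

```python
# Values copied from the table and snapshot above.
positions = [
    {"ticker": "MIST", "shares": 14.0, "buy_price": 1.75, "stop_loss": 1.6, "pnl": 13.02},
    {"ticker": "SLS", "shares": 13.0, "buy_price": 1.41, "stop_loss": 1.1, "pnl": 4.29},
]
cash = 22.45

def cost_basis(position):
    """Dollars originally paid for the position."""
    return round(position["shares"] * position["buy_price"], 2)

def market_value(position):
    """Current value, assuming PnL is unrealized dollar profit."""
    return round(cost_basis(position) + position["pnl"], 2)

def implied_price(position):
    """Per-share price implied by the market value."""
    return market_value(position) / position["shares"]

values = {p["ticker"]: market_value(p) for p in positions}
total_equity = round(cash + sum(values.values()), 2)

for p in positions:
    print(p["ticker"], cost_basis(p), values[p["ticker"]], round(implied_price(p), 2))
print("total equity:", total_equity)
```

Under that assumption the positions are worth $37.52 (MIST) and $22.62 (SLS). Together with the $22.45 cash balance, that adds up to the $82.59 equity figure in the snapshot.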

Portfolio Review

To see the full report: Click Here

Thesis Review Summary

We deliberately constructed the portfolio in recent weeks as a concentrated bet on three independent biotech catalysts. This was a pivot away from a more diversified approach earlier in the experiment, driven by the need to attempt a dramatic catch-up to the benchmark (we’re currently behind the S&P 500 by a significant margin). The underlying logic is:

- These catalyst outcomes are uncorrelated with each other (heart drug approval, anxiety drug trial, cancer vaccine trial – totally different indications and mechanisms). Thus, they represent a form of diversification by event. The success or failure of one does not influence the others.
- By focusing on binary events, we set up the possibility of large upside moves. A single biotech win can double or triple a stock overnight. The trade-off is high volatility and downside risk, which we have seen (our portfolio saw ~-50% drawdown at one point). However, we mitigated risk by using stop-losses and by sizing – we didn’t “go all in” on any one play but spread across three.
- We recognized that to have any shot at beating the S&P by year-end after earlier losses, we needed some multi-bagger opportunities. Simply holding stable stocks wouldn’t cut it in the time remaining. This strategy, while risky, was aimed at that asymmetric payoff.

At a portfolio level, the outcomes we envision:

- One big win out of three could narrow the performance gap substantially. For instance, if MIST alone doubles on approval while the others tread water or hit stops, our portfolio value would jump significantly (likely bringing us close to breakeven vs. starting $100, if not above).
- Two wins (say MIST and VTGN both succeed) could potentially push the portfolio ahead of the S&P 500. The math: a 100% gain on MIST and, say, 80-100% gain on VTGN would more than compensate for maybe a 50% loss on SLS. That scenario could vault the portfolio to new highs.
- All three winning is the grand-slam scenario (rare, but not impossible) that would far exceed the benchmark.

Conversely, zero wins would mean we take stops on all and likely end well below the benchmark – essentially the gamble doesn’t pay off. We understand this approach is akin to a venture capital style or hedge-fund style swing for the fences. It’s not a conventional steady strategy. But given the context (an experiment with a fixed end date and us trailing), it was a calculated decision. Importantly, we set it up so that even in a total bust, the portfolio survives (thanks to stops, we’d still have some cash left to end with something like ~$50-$60 if all failed, rather than $0).
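The scenario math above can be sketched with the two positions listed in the Current Portfolio table (VTGN is not in that table, so it is left out here). This is only an illustration under stated assumptions: position values are taken as cost basis plus PnL, a stop is assumed to fill exactly at the stop price, and `scenario_equity` is a made-up helper name.

```python
CASH = 22.45
# Shares and stops from the portfolio table; "value" = cost basis + PnL (assumed).
HOLDINGS = {
    "MIST": {"shares": 14.0, "value": 37.52, "stop": 1.6},
    "SLS": {"shares": 13.0, "value": 22.62, "stop": 1.1},
}

def scenario_equity(outcomes):
    """Total equity for a scenario.

    `outcomes` maps ticker -> "stop" (sold at the stop price) or a multiple
    applied to the current position value (2.0 means the stock doubles).
    Tickers not listed are assumed unchanged.
    """
    total = CASH
    for ticker, holding in HOLDINGS.items():
        outcome = outcomes.get(ticker, 1.0)
        if outcome == "stop":
            total += holding["shares"] * holding["stop"]
        else:
            total += holding["value"] * outcome
    return round(total, 2)

all_stopped = scenario_equity({"MIST": "stop", "SLS": "stop"})
mist_doubles = scenario_equity({"MIST": 2.0, "SLS": "stop"})

print("both stops hit:", all_stopped)          # 59.15
print("MIST doubles, SLS stops:", mist_doubles)  # 111.79
```

With these inputs, the total-bust case lands at about $59, inside the ~$50-$60 range mentioned above, and a MIST double with SLS stopped out lands a little above the starting $100. Keep in mind that binary biotech news often gaps a stock straight through its stop price, so the real floor could be lower than this sketch suggests.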

We also consciously decided not to hedge these catalyst plays with, say, index shorts or diversified longs, because that would dilute the upside. The whole point was to maximize the impact of any wins. We accept the downside as the cost.

Now, as we enter what is likely the climax of this experiment, the portfolio is positioned for potentially wild swings. We have plans in place for either outcome of each event. The next two weeks (Week 24 and 25) will likely determine the final outcome.

In summary, our portfolio-level thesis is: concentrate risk into a few uncorrelated high-impact bets, manage the downside with stops, and let the upside run. This gives us a fighting chance to outperform dramatically, at the cost of higher volatility – a risk we chose knowingly.

Future Plans & Progress

I’m excited to report that I have also made massive progress on the LLM benchmark. The architecture for portfolio handling and execution is completely finished; all I have left is assembling the workflow, finishing basic metrics, and writing documentation. This system is far more advanced and polished than my current setup, so everything from data analysis to automation will be massively upgraded for all future projects.


Also, these updates are meant to be quick snapshots of progress. After Week 26, I’ll release a comprehensive research report with deeper analysis, visualizations, results, and a breakdown of what worked and what could be optimized. I’ll post it here and on X, so stay tuned!

Full chats: Here

Have a question? Check out: Q&A

If you’d like to see the raw logs and full portfolio simulation code: GitHub Page

My X account: NathanS729

If you have any suggestions or advice, my Gmail is: [email protected]

Disclaimer

This project is purely educational and research-focused. Nothing here should be taken as financial advice. Full disclaimer: Here