An Interview with Graham Giller, Author of “Adventures in Financial Data Science”

Graham Giller has been a data scientist for a lot longer than that phrase has been in existence. In a new book, he reflects on some of his undertakings, which included working at one of Wall Street's best-known quantitative trading operations.

The work falls into a genre that is probably underserved. It is rare that "non-famous" people, as Graham refers to himself in the book, take the time to chronicle their efforts. Yet aspiring data scientists surely have something to learn from the experiences of others.

I asked Graham some questions about the work, which begins with some biographical information before moving on to a series of technical discussions. Graham steered into quantitative work when it was clear that cricket would not work out.

Let’s start with your early life and your crushing disappointment when you discovered that only native-born Yorkshiremen were permitted to play for Geoff Boycott’s cricket team. Did this kill your interest in cricket statistics as well as cricket? I notice there was no further mention of the game in your book.

My interest in cricket was always more physical than observational. I wanted to play the game, which is ironic as I am not particularly athletic and am physically unskilled. As a child, I tried reading my grandfather's copies of Wisden, but it didn't really click for me. I probably wasn't even aware of "cricket statistics" until after I read Moneyball.

Have you been disenfranchised in any similar way since? What about the field of statistics itself? Has the rebranding of statistics been a good thing, insofar as it brings people from diverse backgrounds into statistics, including cricket "data scientists", perhaps?

Regarding the “rebranding of statistics,” I’m very happy to see a realization that data-driven reasoning should be the anchor of business success. I’ve participated in businesses that succeed due to putting the “data people” in the driving seat and that’s definitely a good thing. I’m a little frustrated by the development of a culture of coding over modeling, though, as platforms come and go and I like my people to be interested in the data, not the code.

Your grandmother, Lorna Baugh, was an expert in PDEs or could at least bluff the same. She worked at Bletchley Park and I have to wonder if Bayesianism travels down the female side of the family. Where does that place you ethnically, if we are talking Frequentism and Bayes?

I've gone back and forth a bit on Bayesian vs Frequentist statistics. As I write in the book, from my experience of Quantum Field Theory, and the postulates regarding its interpretation in terms of the fundamental randomness of the Universe, I feel that the physical sciences are full of examples of good frequentist behaviour.

I recognize the intellectual attraction of Bayes, and regard Jeffreys' Theory of Probability as an important book for every student of statistical science. But I also see the point of Fisher's objection that if the interpretation of the outcome of an experiment is subjective then it's just not science.

Ultimately, I've come around to a more "Fisherian" point of view on the foundations of statistics which is based on entropy and information theory. I feel that the Keynes/Jeffreys "personal probability" is definitely a legitimate language to describe learning, but ultimately I believe that fundamental randomness is real and not just representative of lack of knowledge.

Like myself, you have spent the majority of your career in the private sector, building quantitative models for a variety of purposes. Do you miss teaching? What advice would you give to students who are contemplating paths in research, and aren’t sure whether academia or industry is in their future?

I really enjoy mentoring people, sharing my knowledge and understanding, and talking about the conceptual issues that drive my work. In fact, I think it's the thing I like the most, although I'm still very much a victim of that sugar rush you get on a good P/L day or when an analysis works out just right and you see through the fog. I think I would like to find a way back into a more formal academic role, but I'm not sure how to pull that off at the moment, sitting in my basement in a pandemic.

I think many people who pick up your book will be interested in your time working with the legendary Peter Muller, leader of the process-driven trading group at Morgan Stanley. You describe him in less than glowing terms, however. What do you attribute the success of this group to?

The success of the group is undoubtedly due to Peter, and his knowledge and understanding of the business of the scientific analysis of securities prices. Peter’s division of the function into alpha-building and trading strategy clearly added a lot of value when many people were “optimizing” strategies holistically without any idea of what was actually going on.

My experience of him as a manager was that he viewed too many interactions as if they were poker games to win or lose. Now, this was immediately before he had his famous "sabbatical" and I haven't spoken to him once since 2000, so my memories may not be representative of where he is now in his adventure. It's also fair to say that I, personally, also had a lot of growing up to do then.

I do remember one very good interaction with Peter when we spent time together in my office with him showing me some of his tricks for working with cross-sectional data, which was exactly the sort of mentoring I was looking for at that stage in my career, but I also remember those infamous “strategy review” meetings in which he would not accept any alternative thinking at all.

In your book, you also describe interactions with Jim Simons, and provide a “sketch” to the reader (I’m being literal actually) of the methodology he may or may not have been leaning on. If we are to believe that inference of latent market states was an important ingredient, to what extent has that been arbitraged away by the explosion of convenient tooling in that domain? Put another way, what’s your prior on the breakdown of RenTech’s edge: between process, data and modeling?

Gregory Zuckerman’s pretty clear (in The Man Who Solved the Market) that a lot of Jim’s early work in finance was based on Hidden Markov Models, and that’s also exactly the sort of thing that was used at the NSA to break codes and, I understand, by Bob Mercer when he was working on language at IBM. Jim told me personally that he was initially successful trading FX “but not in the right way,” which I took to mean it was doing the sort of “magical incantations” of technical trading rules and similar stuff rather than alpha modeling.

I think two specific “process” things that RenTech did right, and others didn’t, was putting all of the alphas together into a single model (a Bayesian exercise), as is described in Zuckerman’s book, and trading in a profit maximizing manner with trades sized to the point where the marginal cost of trading equaled the alpha, thereby extracting all possible value. This latter point is based on a conversation I had with Henry Laufer in 1999.
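As a rough illustration of that sizing rule (my own sketch, not Renaissance's actual method): if we assume a hypothetical quadratic total cost of trading, cost(q) = c·q², the marginal cost grows linearly with size, and the profit-maximizing size is reached exactly where marginal cost meets the alpha.

```python
# Illustrative sketch of sizing a trade to the point where the marginal
# cost of trading equals the alpha. The quadratic cost model and all
# numbers here are hypothetical assumptions for illustration only.

def optimal_size(alpha: float, c: float) -> float:
    """Expected profit = alpha*q - c*q**2 is maximized where the
    marginal cost 2*c*q equals alpha, i.e. at q* = alpha / (2*c)."""
    return alpha / (2.0 * c)

# 20 bps of expected edge against a made-up impact coefficient:
q_star = optimal_size(alpha=0.002, c=0.0001)
# At q*, every extra unit traded costs more than it earns, so all the
# extractable value (alpha**2 / (4*c)) has been captured.
```

Any concave profit function gives the same logic; the quadratic form just makes the first-order condition solvable by hand.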

Perhaps that is a segue into interest rate modeling. You look at 3-month Treasury bills and a hidden Markov model for neutral, bearish and bullish movements. Your data runs to the present day, suggesting there is plenty of gold to mine there.

I still believe there is exploitable alpha in rates, but it can be lethal because of the direct exposure to factor risk and the huge tails in the return distributions that follow from that. In fact, the strategy I have most recently implemented is one in Eurodollar Futures, which I described to friends as “putting the band back together.”

I think it's pretty clear once you start looking at thirty-plus years of data that the model you use has to be non-stationary. A thing you and I talked about at JP Morgan was how to combine the non-stationarity provided by things like Kalman filters with the fundamentally leptokurtic nature of the underlying price processes. I think I have recently "threaded the needle" on that one but how that was done is going to remain "subscribers only" for a while.
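For readers curious about the mechanics, the kind of three-state hidden Markov model mentioned above can be sketched in a few lines of forward filtering. Every parameter below (the regime means, the volatility, the transition matrix) is a hypothetical stand-in, not anything from Graham's book:

```python
import numpy as np

# Sketch of a three-state HMM for yield changes: "bearish", "neutral",
# "bullish" latent regimes with Gaussian emissions. Parameters are
# invented for illustration; in practice they would be estimated,
# e.g. by the Baum-Welch algorithm.
means = np.array([-0.03, 0.0, 0.03])   # mean yield change per regime
sigma = 0.02                           # common emission volatility
A = np.array([[0.90, 0.08, 0.02],      # regime transition probabilities
              [0.05, 0.90, 0.05],
              [0.02, 0.08, 0.90]])

def filtered_probs(dy):
    """Forward-filter P(state_t | data up to t) for a series of changes."""
    p = np.full(3, 1.0 / 3.0)                        # uniform prior
    out = []
    for y in dy:
        lik = np.exp(-0.5 * ((y - means) / sigma) ** 2)  # Gaussian, up to a constant
        p = A.T @ p          # predict: propagate through the transition matrix
        p = p * lik          # update: weight by the emission likelihood
        p = p / p.sum()      # normalize back to a probability vector
        out.append(p.copy())
    return np.array(out)

# A run of upward moves should push mass onto the "bullish" state (index 2).
probs = filtered_probs([0.04, 0.03, 0.05])
```

The non-stationarity Graham refers to would come on top of this, for example by letting the means or transition matrix drift over time.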

Let’s turn to elections. Your analysis indicates we might be headed for a very short president sometime soon, and perhaps one in poor health. Or did I misread?

Yeah, I think that's the wrong way around. The data favours height differences, with taller presidents getting elected. I wanted to interpret this in terms of public perceptions of candidate health, but I reckon the real reason is buried deeper in our reptilian brain than that. Incidentally, this piece of work was inspired by The Myth of the Rational Voter by Bryan Caplan, which is one of my favourite books.

Speaking of election forecasts, I couldn't help but notice that at least two very prominent election models, one from Nate Silver and the other from the Economist, had significantly worse Brier scores than the prediction markets. Andrew Gelman is of course a very fine statistician and advised on the Economist model. Are they adding value, and are there lessons from finance that might be applied to this area?

There are many issues with how human beings process probabilistic information, as you well know from the work you've done on longshot biases at the racetrack. I was on a panel at Columbia in 2017 which included people like the editor of Heard on the Street and other journalists. They all espoused the "models don't work because Trump got elected" point of view. We forget that unlikely things do actually happen and 4:1 shots win horse races almost every day of the week. So do 100:1 shots from time to time. I would say that simple place betting is one of the most efficient markets I've ever studied.
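For readers unfamiliar with the scoring rule mentioned in the question, the Brier score is simply the mean squared error between forecast probabilities and the 0/1 outcomes, so lower is better and a confident wrong forecast is punished heavily:

```python
# Minimal Brier score: mean squared error between forecast probabilities
# and binary outcomes (1 = event happened, 0 = it did not).

def brier_score(forecasts, outcomes):
    """Lower is better; 0 is a perfect forecaster, 1 the worst possible."""
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)

# Confident and right scores near zero:
good = brier_score([0.9, 0.8], [1, 1])    # (0.01 + 0.04) / 2 = 0.025
# Confident and wrong scores near one, which is how a model that
# dismissed an "unlikely" winner gets beaten by the prediction markets:
bad = brier_score([0.05], [1])            # 0.9025
```

Comparing models this way over many races or elections is exactly the kind of exercise the question alludes to.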

Moving to Coronavirus, your book includes some geospatial analysis of the virus's spread, in particular in your home state of New Jersey. Do you think it is the case that non-spatial compartmental models still have a grip on the intelligence feeding policy? Did you receive any interest or inquiry from policymakers, and if not, what might be done to better harness "dilettante epidemiology" from applied mathematicians such as yourself?

I didn’t receive any inquiries about my work, but I found the whole experience very satisfying intellectually (although occasionally terrifying based on some of the numbers I was outputting). It was quite fun to get back to looking at ordinary differential equations, which I hadn’t done for decades, and then to try and figure out how to transform them into discrete time models.

It does show that there's a large community of quite skilled people out there who are willing to donate their time to causes in the public good should they be asked. My guess is that non-spatial compartmental models are 100% of the modeling in use, but what should be done are large-scale agent-based simulations. All in all, I'm a believer in building models, and forecasting from them, while recognizing the limitations of such an exercise. But what else do you have?
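The exercise Graham describes, turning the ordinary differential equations of a compartmental model into a discrete-time model, can be sketched with the classic SIR system and a simple Euler step. The parameters below are illustrative, not fitted to any real outbreak:

```python
# Discrete-time sketch of the SIR compartmental model:
#   dS/dt = -beta*S*I,  dI/dt = beta*S*I - gamma*I,  dR/dt = gamma*I
# with S, I, R as population fractions, stepped forward by simple Euler.
# beta and gamma are made-up illustrative values, not fitted parameters.

def sir_step(s, i, r, beta=0.3, gamma=0.1, dt=1.0):
    """Advance susceptible/infected/recovered fractions by one time step."""
    new_infections = beta * s * i * dt   # contacts between S and I
    recoveries = gamma * i * dt          # I leaving for R
    return s - new_infections, i + new_infections - recoveries, r + recoveries

s, i, r = 0.99, 0.01, 0.0               # seed the epidemic with 1% infected
for _ in range(100):
    s, i, r = sir_step(s, i, r)
# The fractions sum to 1 by construction, the epidemic peaks and then
# recedes, and a large share of the population ends up in R.
```

Non-spatial models like this treat the whole population as one well-mixed pool, which is exactly the limitation that geospatial or agent-based approaches try to address.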

Twitter sentiment analysis has also occupied your time. What are your impressions? It seems that the use of Twitter for forecasting fundamental quantities or official reported numbers is more promising than more direct attempts at alpha. Do you agree?

I don't think social media is the right place to search for short-term alpha in liquid stocks. Having spent a bunch of time at Bloomberg thinking about the differences between traditional news and crowdsourced reporting, the clear advantage of social media is the breadth of coverage, not the lack of latency. My current Twitter project is based on nowcasting sentiment as a macro driver, and that seems to be working.

You speak about the sell-side brain drain and I have similar nostalgia — in my first job I was surrounded by mathematics professors and string theorists. Not so much later on. I’m sure there are many reasons for this and in some ways, it is simply a numbers game, but how much of this is a function of the changing “ownership” of the broad range of activities we call data science? Or is it simply a function of sell-side leverage ratios?

I think post-2008 we've seen the ascent of the heavily capitalized balance sheet at the expense of the quants. The objective of the large commercial banks seems to be to rank number one in market share, not necessarily to be the smartest guys in the room. If profitability is a function of the quantity of capital deployed more than the manner in which it is deployed, then the brain trust doesn't actually add any value.

In terms of “data science,” we’ve seen a successful rearguard action deployed by traditional I.T. types to brand business analytics as “engineering” and thereby maintain control over this new “technology” deployed into institutions.

Graham’s book is available for your Kindle on Amazon, on Apple Books, and on Google Books at the Google Play Store. The paperback can be ordered from Barnes & Noble, or you can pick up a signed copy from Graham’s website.