Using an eGPU to provide critical processing for AI research (and games) – Part 1: Research and Planning

This is a log of experiences and experimentation in moving from more traditional home computing –ATX cases, components, water cooling, and continual upgrades– to something a bit more modular in terms or GPU computing power. This guide probably isn’t for most people. It’s a collection of notes I took during the process, strung together in case they might help someone also looking to pack multiple power-use-cases into as small a format as possible.

[Note:] A later evolution should involve a similar down-sizing of a home storage appliance.

Objectives

An external GPU requires more setup, and -let’s face it- fiddling than getting a gaming laptop or a full PC case that can handle multi-PCIe slot GPUs. So why do it? A couple objectives had been bouncing around in my head that led me to this: – I need a system that can run compute-intensive and GPU-intensive tasks for long periods of time, e.g. machine learning, and training large language models – I need a light laptop for travel (i.e. I don’t want to carry around a 5+lb./2.5 kilo gaming laptop) – I want to be able to play recent games, but don’t need to be on the cutting edge of gaming – I want to reduce the overall space footprint for my computing devices.

In summary, I want my systems to be able to handle the more intensive tasks I plan to throw at them: Windows laptop for gaming and also travel, the stay–at-home system can perform long-running tasks such as AI model training, password cracking, and daily cron jobs.

Things I don’t care about: – being able to play games while traveling – document data diverging due to on multiple systems: I use a personal #NextCloud instance to keep my documents in sync.

Current State

I have a number of personal computing devices in my home lab for testing things and running different tasks, but they’re all aging a bit, so it is time to upgrade: – my Razer Blade 13 laptop is from 2016 – my main tower/gaming PC is from 2015 with an Nvidia GTX 1060 – an i5 NUC from 2020 (unused) – an i3 NUC from 2013 (unused) – A 6TB NAS with 4 aging 2TB drives from 2014 – Raspberry Pis and some other non-relevant computing devices

Configurations

With the objectives in mind, and realizing that my workload system would almost certainly run Linux, the two configurations for experimentation were: – Intel NUC with an eGPU – Lightweight laptopi (e.g. Dell XPS 13) with an eGPU

[Note:] The computing systems must support at least Thunderbolt3, though version 4 would be best for future-proofing.

Shows an Nvidia GTX 1060 in a Razer Core X Chroma eGPU enclosure Image: Original GTX 1060 GPU slotted in the Razer Core X Chroma enclosure

Background Research

Before starting on this endeavor, I did a lot of research to see how likely I’d be able to succeed. The two best sources I found was the eGPU.io site with many reviews and descriptions of how well specific configurations worked (or didn’t). They also have nice “best laptop for eGPU” and Best eGPU Enclosures matrices.

Nvidia drivers and Ubuntu

Installing Nvidia drivers under #Ubuntu is pretty straightforward these days, with a one-click install option built-in to the operating system itself. The user can choose between versions, and my research showed that most applications required either version 525 or 530. I installed 530.

eGPU information

The best two sources I found for information on configuring and using eGPUs were: – r/eGPU on reddit – their “so you’re thinking about an eGPU” guideegpu.io

Proof-of-concept

Having read a fair amount about the flakiness of certain #eGPU setups, I approached this project with a bit of caution. My older tower had a respectable, if aging, GTX 1060 6GB in it. Since I already had a recent Core i5 Intel NUC running Ubuntu and some test machine learning applications, so all I needed to fully test this was the enclosure. Researching the various enclosure options, I chose this one because: – the Razer Core X series appears to have some of the best out-of-the-box compatibility – I’ve been impressed with my aging Razer laptop, so I know they build quality components – The Chroma version has what is basically an USB hub in the back with 4 USB 3.x ports and an ethernet jack added to the plain Core X version My thinking was that this system could not only provide GPU, but also act as an easy dock-hub for my primary computers. This didn’t work out quite as I planned (more in the next post).

The included thunderbolt cable is connected from the NUC to the eGPU. Theoretically, the standard peripherals (keyboard, mouse, etc.) should be connected to the eGPU hub and everything will “just work”. However, in my testing, things worked best with the peripheral hub I use plugged into the NUC and only the #Thunderbolt cable plugged into the enclosure. In the spirit of IT troubleshooters everywhere: start by making the least amount of change and iterate from there.

Intel NUC on top of Razer Core X Chroma eGPU Image: Just the enclosure with a NUC on top.

Experience

The NUC was on Ubuntu 20.04. The drivers installed just fine, but the system just wouldn’t see the GPU. Doing some research, it looked like people were having better results with more recent versions of Ubuntu, so I did a quick sudo apt dist-upgrade and upgraded the system to 22.XX. The GPU worked! However, the advice I’d been given was to upgrade to 23.04, so I did that and still the system worked fine.