I Tried to Break a Million Dollar Computer – IBM Z16 Facility Tour!

– Just yank it? Ah, come on, you bastard. Oh, wow. Now it's really mad. Whole thing's still running. Meet Z16, the latest member of IBM's mainframe family. That's right, not only do mainframes still exist, they're powering high-frequency transactional industries like global banking and travel. But as of today, you can get a brand spanking new mainframe configured with up to 256 cores built on the latest seven-nanometer node from Samsung and with up to 40 terabytes of memory. Very few outsiders have ever been invited to IBM's 100,000 square foot Z test lab. So we are gonna pack as much as we can into the very short time that
we get to spend on site today. Just like I pack our videos with sponsors. – Ridge Wallet, time to
ditch that bulky wallet. Ridge Wallet can hold up to 12 cards, comes with a cash strap or money clip, and is available in tons
of different colors.

Use the code "Linus" to save 10% and get free worldwide
shipping at the link below. (upbeat music) – The core pillar of Z
is right in the name. It's short for zero, like zero downtime. And their target is seven
nines of reliability, or about three seconds of downtime a year on average. And achieving that takes an architecture that's built from the ground up for resiliency. Now, out of the hundreds of Z systems that they're beating the heck out of here in the lab, there's just one next-gen Z16 that they got all gussied up for us with the fancy covers and everything, only for us to immediately ignore them and start digging around in its guts.
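For the curious, that "seven nines" figure really does work out to about three seconds a year. Here's the quick back-of-the-envelope arithmetic, mine rather than anything from IBM's spec sheet:

```python
# "Seven nines" of availability means 99.99999% uptime.
availability = 0.9999999
seconds_per_year = 365.25 * 24 * 3600
expected_downtime = (1 - availability) * seconds_per_year
print(f"~{expected_downtime:.2f} seconds of downtime per year")  # ~3.16 seconds
```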

Starting at the bottom,
I initially assumed that these heavy black chunguses
here were cooling lines. But as it turns out, Z16 does
not support being plumbed into a building wide cooling system. Instead, they opted for
a self-contained system where every rack that has a
compute drawer gets this pump and radiator unit down here at the bottom. And if you look closely, you can actually see the
two redundant cooling pumps, that's in case one fails. There's actually also an empty spot where it looks like one
of the pumps is missing. That's because in the validation phase, they weren't sure if these
pumps were gonna meet their reliability standards. So the plan was to just
tuck three of them in there, just in case. These are actually power lines, then, routed through the three-foot crawl space underneath us that covers the entire test facility.

And each of these two is
capable of carrying 60 amps of 208 volt three-phase power
on this particular config. Power that needs to be distributed. That's where these come in. Each of these power distribution units, or PDUs, down the side here is meant to be fed from a separate breaker in the event of a power loss. And basically, you can think of these as the enterprise-grade
equivalent of a power strip. So using the onboard ethernet port, a technician can monitor power usage, turn off particular plugs for maintenance, or even update the firmware. Does your power strip
have upgradeable firmware? No? Gross. Next up, things get really spicy. Each Z16 system can be configured with up to four of these compute drawers. And each compute drawer
contains up to four of their new Telum chips. And these things are really cool. Let's go take a closer look. What a monster. Telum is a chiplet design, and you can actually see the two separate dies here.

It's got support for DDR4
and a total of 16 cores and 64 lanes of PCI Express per socket. Which is neat, but that tells
only a fraction of the story. While a consumer x86 CPU might contain dedicated hardware for, let's say, decoding popular video codecs, Telum brings some very different specializations to the table. Each one of these cores, so there's eight per die, has a co-processor to accelerate storage, crypto, and IBM's own compression algorithm. Then each die gets its own gzip compression accelerator. And this is one of their biggest announcements: an AI accelerator. The reason for bringing that on die is that customers like financial institutions have these complex machine learning models that they've built for, for
example, fraud detection. But data scientists care about accuracy, not necessarily the
downstream performance impact.

So what they found out was that
they had these great models, but they didn't have the
performance to actually apply them to every transaction. Or, well, rather, they could
apply them to every transaction if they had all the time in
the world, but they don't. So if your bank has an SLA
or a performance guarantee of nine milliseconds on a transaction, and there's not enough time to apply their fraud detection model, they either have to just let
it go, fraudulent or not, or decline it, risking the customer just
pulling out their next card in their wallet, effectively giving that
business to their competitor. By putting the AI processor on die, they're giving it direct access to the transaction data that is sitting in the CPU cache, rather than forcing it to go out to memory for it.
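To make that time-budget squeeze concrete, here's a rough sketch of the decision a payment pipeline faces. It's purely illustrative: the 9 millisecond budget echoes the hypothetical SLA above, and every function and field name here is a stand-in, not anything from IBM's or any bank's actual stack.

```python
import time

SLA_BUDGET_S = 0.009  # hypothetical 9 ms end-to-end transaction budget, as in the example above

def score_fraud(transaction) -> float:
    """Stand-in for a fraud model; on Telum this inference would run on the on-die AI accelerator."""
    return 0.02  # pretend risk score

def process_transaction(transaction) -> str:
    start = time.perf_counter()
    # ... the actual authorization work would happen here ...
    remaining = SLA_BUDGET_S - (time.perf_counter() - start)

    if remaining <= 0:
        # Out of time: the bad tradeoff described above -- wave it through unscreened,
        # or decline it and risk the customer reaching for a competitor's card.
        return "approved_unscreened"

    score = score_fraud(transaction)  # has to finish inside `remaining`
    return "declined" if score > 0.9 else "approved"

print(process_transaction({"amount": 49.99}))  # "approved"
```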

And compared to last-gen Z, IBM is figuring on a 20 to 30x improvement in performance with, and this is critical, better consistency and accuracy. The cache on this chip is really special too. But to properly talk about that, we need to zoom out a little bit. Now we're looking at a
full compute drawer here. Each of which contains four Telum chips.

Actually, this one's only
got three in it so far. I'm assuming I have this ESD strap on, so I can install the last one, right? – [PJ] All you. – Is it this one?
– [PJ] Yep. – Wait, so you guys handed
me a working chip before? Oh, well, that was your first mistake. Okay. Where's the nifty install tool? Ah, yes. Okay, cool. So I've seen this demo once, which should be more than enough for me to perform it for you. The Telum chip goes in a
little something like that. Then this doohickey (indistinct) lines up with the dots on the thing, then you grab the chip, right? You make sure it's not gonna come off over something soft-ish. Okay. It holds on a little something like that. We really hope the CPU doesn't come out. If it comes out in the
socket, this boy's done. It won't, right? (PJ laughs) I just have a bit of a reputation and I don't need it right now. I don't need the sass from my camera operator here. Then I poke this in here, I think. I poke in there and then
I squeeze to lock it.

Well, unlock it. And boom, it's in. We give a little wiggle. Heck yeah. Now we're halfway. The next thing we gotta
do is install our cooling. And this is super cool. One of the first questions I asked them when I walked up to
this compute drawer was, "What the heck kind of
thermal compound is that?" My initial gut feeling was that it looked like some
kind of liquid metal. Like, I mean, it's not
like it's not being used in commercial products. The PlayStation 5 is using liquid metal. But actually, it's solid metal. So rather than being like an indium-gallium mix of some sort, this is just an indium thermal pad. And what's great is you can actually see how, there you go, they've got the size just exactly right to cover where the dies are creating hotspots under the integrated heat spreader. So these are gonna be the interface between the integrated liquid
cooling system in the chassis, there we go, and our processors. And I guess, I can't even tell… Is the latch on? Kind of looks like it.

Where's my torque wrench or screwdriver? Oh, is this it?
– [PJ] That's it. – Look at that. It's all
configured for me and everything. Yeah, this feels like a lot. I mean, I guess it's a
lot of pins, but still. Oh, this really feels like a lot. Is that… Okay. Whoa. Putting that kind of pressure on that kind of spicy,
expensive stuff is stressful. I'm sure you guys are used to it though. This is adorable. Little 3D printed hose holders. Do you know how many times
I've been working on a system and I wished I had one of these to hold a block out of the
way while I'm doing stuff? I love it. I don't like this screwdriver. We're gonna have to make a
better torque screwdriver for you guys. We're working on our own
screwdriver, lttstore.com. Yeah, he knew I was gonna
say it. Bloody hell. That looks freaking awesome. Like that is hardware porn right
there if I've ever seen it. This four-CPU config gives us a total of 64 cores in this drawer and a whopping 256 PCI Express Gen 4 lanes.
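Those drawer totals follow straight from the per-socket numbers mentioned earlier (16 cores and 64 PCIe lanes per Telum socket); a trivial sanity check:

```python
# Quick sanity check of the drawer totals from the per-socket figures above.
sockets_per_drawer = 4
cores_per_socket = 16          # 8 per die, two dies per socket
pcie_lanes_per_socket = 64

print(sockets_per_drawer * cores_per_socket)       # 64 cores per drawer
print(sockets_per_drawer * pcie_lanes_per_socket)  # 256 PCIe Gen 4 lanes per drawer
```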

But you might have noticed, the motherboard that they're installed on is really unusual looking. Where are the memory slots? Where are the VRMs? Let's answer the second one first. This is called a POL card, or a point-of-load card. They haven't shown me how to take it out, but I'm sure I can figure it out. Ah, yes, there we go. These are super cool. These take the bulk 12 volt power that comes from the power
supplies over on the end here and step it down to the
approximately one volt that we need for the CPUs and the DRAM modules. And the craziest thing about these is that they've got 14 power phases each and they can be dynamically
assigned depending on where they are needed, with two of them actually
being completely redundant. So these three do all of the work, and these two, including one
of the ones that I took out, are doing absolutely nothing
unless something fails.
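The way I read that arrangement, it's basically N+2 redundancy: five point-of-load cards in the group, three carrying the load, two sitting idle as hot spares. Here's a toy model of the idea; my own sketch with made-up names, not IBM's firmware logic.

```python
# Toy model of the 3-active / 2-spare point-of-load arrangement described above.
status = {f"pol_card_{i}": "ok" for i in range(5)}
active = ["pol_card_0", "pol_card_1", "pol_card_2"]   # doing all of the work
spares = ["pol_card_3", "pol_card_4"]                 # doing absolutely nothing... for now

def card_failed(card: str) -> None:
    """Mark a card failed and promote a spare so the load never drops."""
    status[card] = "failed"
    if card in active and spares:
        active[active.index(card)] = spares.pop(0)

card_failed("pol_card_1")
print(active)  # ['pol_card_0', 'pol_card_3', 'pol_card_2'] -- still three healthy cards
```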

I love seeing that old-school IBM-ass logo on, like, cutting-edge (beep). Total combined output
not to exceed 660 amps. Okay, I'll be sure not to. Now, they get some additional help from… Ah, yes. Here they are. Come on out little buddy. There we go. This is called a voltage
regulator stick or a VRS. It's smaller with fewer phases, but it contributes to
step down for IO cards and things like that. And then… Oh, one other really cool
thing I wanna show you guys is the oscillator card
in the front of the… Oh, I think this thing is locked. There we go. I'm moving it. In the front of the compute drawer here. We talked in more detail
about what cards like this do in our timecard video.

But essentially, they
take an external signal and ensure that all the
machines within a data center are running at exactly the same time to avoid wasting clock cycles and operate more efficiently. Now let's move on to memory. And this is where things get really funky, 'cause I ain't never seen a memory module that looks
anything like this thing. There's at least three
crazy things going on here. Number one, is that even
though this is DDR4, some of the power delivery is
actually on the module itself, like we've seen with DDR5. Second up is the IC configuration. We've got 10 of these DRAM chips
per side for a total of 40, which doesn't really correspond with any kind of ECC error correction that I've ever seen in my life. And it turns out that that's because it's
not done on the module. Each memory controller in the system actually addresses eight of these memory modules, and the ECC is handled more like RAID, with the parity data striped across the eight different modules. It's all managed at the
memory controller level.
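Here's a stripped-down illustration of that RAID-like idea: spread the data across seven modules plus one XOR parity module, and any single dead module can be rebuilt from the survivors. A toy, single-parity version of the concept, not the controller's actual scheme.

```python
# Toy RAID-style layout across 8 "modules": 7 carry data bytes, 1 carries XOR parity.
data_chunks = [0x10, 0x22, 0x35, 0x47, 0x58, 0x6A, 0x7C]  # pretend data, one byte per module
parity = 0
for b in data_chunks:
    parity ^= b
modules = data_chunks + [parity]        # what the controller spreads across the 8 modules

# Simulate module 3 dying and rebuilding its byte from the surviving seven.
lost = 3
rebuilt = 0
for i, b in enumerate(modules):
    if i != lost:
        rebuilt ^= b
assert rebuilt == modules[lost]         # the lost data comes back, no interruption
print(hex(rebuilt))                     # 0x47
```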

Finally, we've got this
bad boy right here. What's this thing called again,
with the copper chip set? – [PJ] Explorer. – It's the Explorer. This is a proprietary buffer chip that basically adds a latency hit, but allows the memory controller to address vast amounts of memory, with each of these systems able to handle 10 terabytes of DDR4 memory. Absolutely flipping insane. Papra, pa, pa, pa, pa. Now let's talk about why I
needed all four of these chips to explain Telum's cache configuration. Where a normal CPU would have a tiny, lightning-fast level one cache sitting right next to the processing cores, followed by a larger, slower level two, and then an even larger, even slower level three, and so on and so forth, Telum has an already aggressive, by consumer standards, 256 kilobyte private level one cache per core.

So they've got eight of those per die, 16 per socket. That is then backed up by a whopping 32 megabytes of level two cache per core. To give you some idea of what this means in practical terms, look at this die shot. This thing has more cache than compute by area. And it does away with level three cache altogether. Out the window. Now, IBM's engineers probably could have said, "Well, the level two cache is so big, we probably won't need level three anyway." Fair enough. But they didn't. Instead, when a line needs to be evicted from one of the cores' private level two caches, it will actually look for empty space in another core's level two, and then mark that as virtual level three cache. Now, for that core to then fetch it from another core's cache does add latency compared to if it had been in its own level two right next to the core, but that would be true of a shared level three cache anyway.
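Here's a crude sketch of that eviction dance, just to make the mechanism concrete. It's my own few-line simplification; the real thing happens in silicon with far more bookkeeping than a couple of dictionaries.

```python
# Crude model of the "virtual L3" idea: each core has a small private L2 (a dict),
# and a line evicted from one core's L2 gets parked in a neighbor's L2 and tagged
# as virtual L3 instead of heading straight back out to memory.
L2_CAPACITY = 4  # tiny on purpose, just for illustration

cores = [{"l2": {}, "virtual_l3_tags": set()} for _ in range(8)]

def insert_line(core_id: int, addr: int, data: bytes) -> None:
    l2 = cores[core_id]["l2"]
    if len(l2) >= L2_CAPACITY:
        victim_addr, victim_data = next(iter(l2.items()))     # pick a victim line
        del l2[victim_addr]
        for other in cores:
            if other is not cores[core_id] and len(other["l2"]) < L2_CAPACITY:
                other["l2"][victim_addr] = victim_data        # park it next door
                other["virtual_l3_tags"].add(victim_addr)     # and tag it as virtual L3
                break
        # (If nobody had room, a real design would spill further -- to another chip's
        #  cache as virtual L4, or eventually out to memory.)
    l2[addr] = data

for a in range(6):                      # overfill core 0's four-entry L2
    insert_line(0, a, b"x")
print(sum(len(c["virtual_l3_tags"]) for c in cores))  # 2 lines now live as virtual L3
```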

And this approach affords them an incredible degree of flexibility, particularly for cloud customers who might have very
different hardware needs from one to the next. And it gets even crazier. Evicted lines can also
be pushed off of the chip to another CPU altogether
in the Z16 system, allowing a single core to
conceivably have access to up to two gigabytes of virtual level four cache from the other CPUs in the drawer, and up to eight gigabytes if we go beyond. Is this the beyond? I suppose. I believe I died and went to hardware heaven. To my right is a fully kitted-out Z16. And now we can both continue and actually expand on the
tour that we started before with that single rack unit. This one has all four possible
compute drawers populated. You can see there's one here and there's actually three
in the second rack here. And in order to connect them, IBM uses these super
cool SMP-9 active cables.

These things are super badass. They've actually got heat sinks on them. And inside the sheath, you'll find 10 fibers plus one for redundancy, with each fiber carrying 25 gigabits per second, for a total of 250 gigabits per second. Oh, and of course, all of
the links are redundant. Two of them between each
one of the compute drawers. Now over these links, the system allows the CPUs
in the separate drawers to share that level two cache I talked about before as a virtual level four. Which can apparently, in some cases, actually be faster to draw off of this CPU's cache over SMP-9 versus pulling off of your own DRAM in your own compute drawer. It's insane. I'm just getting permission to pull a coupling link card out, but I realized we didn't even
talk about the reservoir. How sick is this? Just wanna fill up your
mainframe. No big deal. Got your like giant… Looks like stainless
steel res going on here.

Absolutely gorgeous. I'm 104! – [PJ] '04! – [Chris] Okay! – Just data center things! How's it going? – [Chris] Good. – Awesome. Look at these chunky quick connects, man. That is sick. Do you want me to pull the collar back? It'll be fine, it'll be fine. You'd have to actually link them together to let the liquid flow. Hey, the light's on, look at this. And one, and…

I don't know. Sure, like that. Hey, there we go suckers. This is a coupling link card. It's using multi-mode fiber,
12 lanes plus four redundant, for short-range, high-bandwidth linking of multiple Z systems
within the data center. And it uses PCI Express. The idea here is that, depending on the customer's
resiliency strategy, which is often government mandated, they could have a whole
web of Z systems configured for high availability, which could allow an
entire drawer to go down with immediate failover and no data loss. That's why these plug directly into the compute drawers. For less bandwidth-intensive IO devices, we rely on these. You can fit so much PCI in this bad boy.

Code name "Hemlock" makes
them sound a little sexier than they are, but they're still pretty cool. Using these direct attached copper cables. Actually… Ah, cool. I've got one of these right here. They have a 16X PCI link
to the compute drawer. And then you can stack these puppies up with a ton of IO cards. NVMe storage, crypto
accelerators, network cards. I mean, you can actually
run Linux on these things. So pretty much whatever add-in board the customer desires can go in one of these. And the PCI Express link between the Hemlock right here and the compute drawer goes a little something like that. Pretty cool. Now, you guys might have noticed that only the center two
racks contain compute while the outer ones are all IO.

In fact, this one over here on the right is just an entire tower of IO. That's normal. And most of the cards that
you're looking at in here are either FICON or CICSPlex cards. And these work together
to handle everything from connecting to storage
racks within the data center to offsite synchronization and backup. They can go unboosted up to 10 kilometers and 100 kilometers with boosters. But in order to test that, we're gonna have to
take a little field trip. It's not for the hairstyle. It's got a mainframe up my (indistinct) here. Welcome to the patch room. In here, you will find
50,000 ethernet connections alongside 200,000 IO connections. The blue ones are OM3 multi-mode fiber for carrying PCI Express
links across the data center. Those are the ones we saw before. And often these would
end up at an IBM DS8K, which is their enterprise class storage. While these ones are almost all FICON. And this is super cool. If normal fiber channel drops a packet, the origin is gonna retransmit
it to the destination. "No big deal," says I. "No," says IBM. "That is a big deal." So FICON actually contains enough storage on each adapter in the link that only the last hop of the journey needs to be retransmitted.
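A simplified way to picture that: hop-by-hop buffering, where every adapter keeps a copy of what it just forwarded, so a drop only costs a resend over the last link instead of a full end-to-end retransmit. The sketch below is my own toy model of that general technique, not anything out of the FICON spec.

```python
# Toy hop-by-hop reliability: each hop buffers the frame it forwards, so a drop on
# link N only needs a resend from hop N, not a full retransmit from the origin.
hops = ["host", "switch_a", "switch_b", "storage"]

def send(frame: bytes, drop_on_link: int = -1) -> int:
    """Deliver `frame` and return how many link traversals it took."""
    traversals = 0
    buffered_at = {}                  # hop name -> copy of the frame it forwarded
    i = 0
    while i < len(hops) - 1:
        buffered_at[hops[i]] = frame  # keep a copy before pushing it down the link
        traversals += 1
        if drop_on_link == i:
            drop_on_link = -1         # the retry will succeed this time
            continue                  # resend from THIS hop's buffer, not from the host
        i += 1
    return traversals

print(send(b"payload"))                   # 3 traversals on a clean run
print(send(b"payload", drop_on_link=2))   # 4 traversals: only the last hop repeats
```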

This apparently has both performance and security implications. Sounds expensive. Then over… Oh crap, I lost it. There it is. Check this out. Inside this box is 50
kilometers of real actual fiber that they can combine with
another 50 kilometer roll, you can actually see it right there, to test out their offsite capabilities under real world conditions. And while we're over
here, this is super cool. Each of these racks is set up
kind of like a site A, site B. And what these do is they
can take hundreds of channels of 1310 nanometer fiber, convert them
to different wavelengths, transmit them hundreds of
kilometers away or whatever, then split them all back
out on the other side, like with a prism, into your separate channels. Freaking bananas, right? Then there's the brains of the operation. The support element, or as
I call it, The My Wife Unit. Although because it's Z,
there's two redundant ones. And I don't know how much she'd like that.

The support element acts
as a management interface for the system, making sure that everything
is running smoothly by monitoring this intra-rack ethernet network that is checking temperatures
and functionality and reporting on failure events. Bringing us to the big question. What does it actually do? Well, the biggest draw
for IBM's Z customers is reliability and lightning
fast, accurate processing of transactional data. That's what it does way better than x86. And Z16 aims to improve over last gen by introducing tools that
will help it do all of that, and faster, thanks to a
machine learning boost, as well as adding new tools that will help customers identify and streamline security and compliance obstacles in their organization. Sounds fancy. So I'm sure you're wondering
how much one of these costs. Well, the truth is that the number in the title there, assuming they haven't changed it by now, is more of a guesstimate.

A bare cage actually starts in the neighborhood of $250,000. But every single system is custom-built to the spec of the purchaser. And the IBM folks here laughed and said, "Well, by the time you've got all the hardware and software to actually run it, I'd say we're pretty comfortable with that million-dollar number as an approximation." So you won't be having any of these in your basement lab anytime soon. But what you might have is FreshBooks. Thanks, FreshBooks, for
sponsoring this video. Now I'm gonna go ahead and guess that you're not an accountant, which is why you're
gonna love this software. FreshBooks is built for freelancers and small business owners who don't have time to waste
on invoicing, accounting, and payment processing. In fact, FreshBooks users can save up to 11 hours a week by streamlining and automating pesky admin tasks like time tracking, following up on invoices,
and expense tracking, with features like their new digital bills and receipt scanner. Over 24 million people
have used FreshBooks and love it for its intuitive
dashboard and reports. It's easy to see at a glance exactly where your business stands, and it's even easier to turn everything over to your accountant come tax season.

94% of FreshBooks users
say it's super easy to get up and running. And with award-winning
support, you're never alone. So try FreshBooks for free for 30 days, no credit card required, by going to freshbooks.com/linus. Huh? My voice is shot. So if you enjoyed this video, maybe go check out our tour of SFU's supercomputer. It was super cool. Feel like this is like a
scene out of "Gremlins." Don't get it wet. (laughs).
