Using the New EDSFF (E3) SSDs Effectively

The new EDSFF form factors enable higher system density and more efficient designs for high-performance racks. Here is an overview of the specifications and status of EDSFF E3.x, along with insights on using these SSDs effectively.

00:02 Cameron Brett: Hello, my name's Cameron Brett. I'm with Kioxia America, and I'm hosting this panel discussion on using EDSFF E3.x SSDs effectively. With me today are three industry veterans developing standards for storage for data centers. Our panelists today are Bill Lynn with Dell EMC, Paul Kaler with Hewlett Packard Enterprise, and John Geldman with Kioxia America. So, let's go ahead and kick it off. John, can you give us an overview of the specifications and status of EDSFF E3.x?

00:37 John Geldman: Sure, I'd be happy to. Right now, the E3 specification, officially known as SFF-TA-1008, is in the middle of a reboot -- or at least it is when I'm recording this. By the time you're seeing this, it may be an approved specification. To backtrack, it was an approved specification, but we've decided to make some changes to make it better, and we're going to obsolete the previous one. So, the only one for you to pay attention to is the new one. Think of it as a reboot.

One of the major changes we're making is to improve compatibility with the OCP NIC, so the same slots can be used for both types of devices. Also, we're changing it so that you can design E1.S hardware that will fit into the E3 slot carriers. So, this is kind of a fun reuse capability for SSD makers. And when we're talking about low capacity, 4 to 8 terabytes -- it's fun to say that -- you might be able to get away without redoing that board. It might end up being less work to support both form factors.

02:00 JG: So, we're seeing a lot of standardization in a variety of areas to migrate HDDs into systems that support PCIe-based EDSFF, and this potentially is one of the form factors that reaches into the future for them. It's kind of the obvious option to replace the 2.5-inch, and there are multiple organizations involved in this. There's work going on in NVMe, and there's work going on at OCP. So, the major specs involved are SFF-TA-1006 for the E1.S form factor -- that gets you the small candy bar shape. You have the E1.L, which is SFF-TA-1007. That's the long one, which used to be called the ruler when it was first introduced. And then we have the E3 specs, which are under SFF-TA-1008. All three of these specs share the SFF-TA-1009 PCIe mapping, which is also currently going through some clarifications, not a complete rebuild.
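
For quick reference, here is a minimal sketch mapping the specification numbers mentioned in this discussion to the pieces of EDSFF they cover; it reflects only what the panelists say here, so consult the published SNIA SFF documents for the authoritative definitions.

# EDSFF specification numbers as described in this panel discussion.
EDSFF_SPECS = {
    "SFF-TA-1002": "card-edge connector shared across the EDSFF family",
    "SFF-TA-1006": "E1.S form factor (the small 'candy bar')",
    "SFF-TA-1007": "E1.L form factor (the long one, once called the 'ruler')",
    "SFF-TA-1008": "E3 family (E3.S, E3.S 2T, E3.L, E3.L 2T)",
    "SFF-TA-1009": "PCIe mapping shared by the EDSFF form factors",
    "SFF-TA-1023": "power and thermal specification (in progress)",
}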

03:24 CB: OK, so thanks for the background on that, John. So, Paul, could you talk maybe a little bit about the difference between E1 and E3?

03:34 Paul Kaler: Sure, yeah, thanks, Cam. I'll start here at the top with E1.S, SFF-TA-1006, which John mentioned. That is a much smaller form factor, very similar to M.2. There are five different variations of it. Some are very similar to M.2 -- they're uncased. Some have cases that allow them to support higher capacity along with higher power for better performance. There's a range of different applications where E1.S could work. We especially see it at the edge, in very small, density-optimized servers, where you might be able to fit a decent number of drives into a very small space to get the performance that you're looking for.

As I mentioned, there are five different variations, and a couple of the cased versions are listed here. The 9.5 millimeter is a general-purpose use case. The 15 millimeter gives you some additional performance benefits with the heat sink attached, and the 25 millimeter allows you to go to even lower airflow requirements at the same power. So, again, they target different use cases, enabling the different power and performance profiles a customer might need.

04:50 PK: And then we get to the E1.L. This is primarily of interest to the hyperscale market for very specific storage applications, really targeting some of the QLC high-density applications. And when we get to E3, we see some of the differences here. Really, the biggest difference, if you just look at the two form factors, is size. You'll see later in the presentation what the E3.S sizes look like, but it's very similar to a 2.5-inch form factor drive. It allows you to scale all the way from a x4 up to a x16 interface, whereas E1 tops out at a x8 interface, and E3 also allows you to go to a much larger power profile than E1's lower power profiles.

So, that will take care of some of the application use cases we've seen in the enterprise market -- things like storage class memory and future CXL devices that are going to be more power hungry. Going to PCIe Gen 5, power requirements climb into the 35-plus watt range, and that's where we see accelerators for computational storage, and PCIe Gen 5 and beyond, enabled by the E3 form factor.

06:00 PK: You just have more surface area and more ability to support larger ASICs inside the device, to support multiple device types. And then we have a couple of thickness options that you'll also see later, so that enables denser storage designs where you can have much more NAND capacity inside the device with the 2T thickness. And then coming to the last row here is the E3.L, which is the long version of the E3 Short; that's where you need even more capacity. So, there's some interest there in storage environments where you really want to target very high-density applications. It also enables you to get to even higher power profiles, so you can potentially deliver storage class memory or other high-power, high-performance devices in that form factor.

06:51 CB: OK. Thanks a lot, Paul. So, with any new specification or initiative, you always have to be solving some sort of problem. Bill, can you take us through the improvements that could be made to the current situation?

07:08 Bill Lynn: Yes. Thanks, Cam. So, basically, when we started out with E3, what we were looking to do was find something that was really optimized for NAND flash memory. With E3, we wanted a form factor that had enough surface area to really pack in a sufficient amount of memory to make it worthwhile. We wanted to be able to maximize storage density. For servers, everything's getting hotter and faster, so we really needed something that could help improve our thermal and cooling capabilities within the servers, and we were looking for something that is not limited to SSDs or storage. If you look at servers today, we're starting to see demands for front-facing I/O and for alternate device types like storage class memory, accelerators and TPUs. So, we were looking for something that could work for a variety of device types. And to do a variety of device types, we really needed to be able to support a wider PCIe link width. With E3, like Paul mentioned, we can support x4, x8 and x16, and if you really want to take advantage of these higher link widths, you need a wider power range. If you want to be able to saturate a x8 PCIe link, you really need something beyond 25 watts. So, we're looking at going to 35, 40, up to 70 watts, possibly beyond that with future generations.

09:00 BL: The other thing that was a major driving point was we needed something that was better for signal integrity. The old SFF-8639 connector that's on the current U.2 is really reaching the end of its life. It's been around for four generations, and it's got a mix of pin types, so it's really not optimal for high-speed signaling. So, we went to a totally new connector, defined by SFF-TA-1002, which is really aimed at PCIe Gen 5 and Gen 6. Those are all of the factors that came into the E3 considerations.

09:44 CB: OK, Bill, maybe you can step us a little bit through some of the imagery here . . .

09:50 BL: Sure.

09:51 CB: New form factors.

09:53 BL: Sure. So, E3 is really a family of multiple device types. Within the specification, we define an E3.S, which is the E3 Short, and an E3.L, which is the E3 Long. And within that, we also define multiple width devices. So, we have an E3.S, which is that first green device, and then we have an E3.S 2T, which is a double thick version. And if you look at it, it's very similar to a 2.5-inch drive that's in use today. Then we have the E3.L, which is the single width. These are 7.5-millimeter devices, and then we have the E3.L 2T, which is the double thickness at 15 millimeters. So, the E3.L 2T is roughly the same size as a current 3.5-inch drive if you want to think about it that way. And as you step up in size and capability, like you can see off to the right there, you can support x4, x8 or x16 connections.

11:08 BL: That's all defined by SFF-TA-1002, which is the connector definition, and you can support various power profiles. The E3 Short, or E3.S, is in the 20 to 25 watt range. The double thickness, we're looking at 35 to 40 watts in there. The E3.L, we're looking at 40 watts, and the E3.L 2T, the double thick, 70 to 80 watts, somewhere in that range. And we may actually be able to go to higher power profiles on that, because we're looking at creating a new variant of the SFF-TA-1002 connector that has an additional power tab. And the beauty of all of these is that they all work together: You can take a thick device and plug it into two thin slots. You can take a short device and plug it into a long slot. So, they're all interchangeable, which makes it very easy from a server point of view to define a common set of mechanicals that allows you to swap different types of devices.
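
As a quick reference, here is a minimal sketch of the E3 variants as Bill describes them, using the approximate thicknesses and power ranges quoted in this discussion; the exact dimensions and device power classes are defined in SFF-TA-1008, so treat these as ballpark figures rather than spec values.

# Ballpark figures quoted in this discussion -- see SFF-TA-1008 for the
# authoritative dimensions and power classes.
E3_VARIANTS = {
    # name: (length, thickness in mm, approximate power range in watts)
    "E3.S":    ("short", 7.5,  (20, 25)),
    "E3.S 2T": ("short", 16.8, (35, 40)),  # 2T thickness is quoted as ~15 mm
    "E3.L":    ("long",  7.5,  (40, 40)),  # by Bill and as 16.8 mm by Paul
    "E3.L 2T": ("long",  16.8, (70, 80)),
}

# All variants share the SFF-TA-1002 connector and can expose x4, x8 or x16
# PCIe links. A double-thick (2T) device drops into two thin slots, and a
# short device fits into a long bay, so one chassis design covers the family.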

12:22 CB: Building blocks.

12:24 BL: Yes.

12:25 CB: Yeah, so you said that they're similar to 2.5- and 3.5-inch drives today, but it sounds like they're much more optimized for flash storage and other devices.

12:36 BL: Yes.

12:37 CB: OK. Well, thanks for the overview on the devices themselves, so you'll obviously need something to plug these into. So, Paul, do you think maybe you can touch on how these form factors improve chassis design?

12:54 PK: Yeah, I'd be happy to. So, as Bill mentioned, you now see how these different form factors get implemented in a couple of different variations. One of the biggest things we did with this was enabling that two-for-one capability, where the airflow between the devices and the thickness of the devices were all designed so that you can support what you see here in a 1U server, where you could have 20 of the thin devices supported. And then you could also have a different configuration. That would be a very dense configuration where you maybe don't have to support a huge number of high-power downstream devices, because clearly you'll have quite a bit of wattage up front.

13:44 PK: And then the second configuration you can see here is a much higher power configuration, so you might cut down on the total number of drives. That way you have a lot of cool airflow going directly back to cool off your CPUs and memory and GPUs and the other power-hungry things you have in servers. And this last one here really gets into the heart of the two-for-one split capability and the ability to have a lot of combinations of different device types. So, it really is kind of a building block -- and, Cam, I think that was a great description of what we can do here with it. You can see here where you have different device types that might be much higher power, like the storage class memory or the accelerator-type devices.

14:28 PK: If you look at the top configuration where you have four of the thin devices, you can decide to swap those out for two of the thick devices. And you can really mix and match any combination that you need in a 1U server, so that you can dial in the right amount of performance, the right amount of storage density and the right device types. So, it's very flexible: Within one chassis and one server, you can have a wide range of device types supported to dial in the right amount of performance and density that you need.

15:04 CB: Well, it sounds like it gives a lot of flexibility and opens up new types of architectures. Maybe touch on a little bit more some of the design advantages.

15:15 PK: Yeah, so we touched on this a little bit, but it's worth driving home the point about some of the power profiles this enables. We talked a little bit about supporting up to 70 watts in the form factor. That's enabled because we have this great shared connector across the entire family of EDSFF, including E1, actually. And the range of capabilities that E3 specifically brings, by supporting a x4, a x8 and all the way up to a x16, is that we have the capability to source up to 80 watts of maximum power across that same connector. And, as Bill mentioned, we have the capability to add an additional power tab that could take us up further in the future. So, we really future-proofed this form factor. Form factor transitions in the server world are always difficult and challenging, so we want to make sure you don't have to do them very often. With E3, we really tried to build in a long life and enough capability that, once we go to this form factor, it'll be there for a long time.

16:24 PK: So, clearly, all of that additional power enables us to support different device types -- the high-power accelerators we talked about and computational storage. There are a lot of things we think we'll be able to enable in the future, as well as things like OCP NIC designs that could come in this EDSFF form factor, in the E3 thick form factor, the 2T.

Clearly, one of the benefits of higher power profiles, even for a standard NVMe device, is supporting higher throughput, and that comes with PCIe Gen 5, which is just around the corner; of course, we're also looking at PCIe Gen 6 and beyond. So, this will enable that for standard NVMe devices, in addition to the other device types we've already mentioned. Also related to power profiles, we are working on a power and thermal specification, SFF-TA-1023.

17:21 PK: That's going to help guide device suppliers to characterize their devices, so that system implementers can compare them and make sure we have the right fans and thermal cooling capabilities for all the different device types that'll be out there. We've talked a little bit about the wide range of device types that E3 enables, and we think that, over time, more and more device types will be supported in this form factor. That's great because we won't have to have a lot of different form factor types. We can have a chassis that supports the E3 form factor and have lots of different devices plugged into it, without having to have unique drive cages or device cages for specific device types.

18:05 PK: Going to the 2T device thickness that we've been talking about, the thicker 16.8 millimeter enables a lot of the standard network interfaces. So, again, for these other device types where you have a front plug and a network cable going in, you need that additional thickness to support it, and we have that capability with the 2T thickness device. And, of course, the larger surface area of the device itself and the circuit boards inside allow for much larger NAND capacity points, as well as support for the larger ASICs that some of these devices might require. Some of these ASICs are over 30 millimeters in size, so all of that gets enabled as well. And, as Bill touched on, we have this great interoperability between the form factors: You can take a short device and plug it into a long bay, or you can take two thin devices out and plug in a thick device in their place.

18:58 PK: And so, you're really able to get a good mix and balance of the types of devices you need, as you saw previously in the server configurations, to the type of requirements that you need to deploy. You can really find the right mix and match of the device types as well as your power profiles to what kind of bandwidth you want to deliver, all of that is very scalable. And then you also can have common spares, which is a key thing we want to be able to enable between our 1U and 2U chassis. We have a really good form factor that works well between both of those, and so if you have your E3 devices, you don't have to have a different custom form factor for a 1U server -- you can enable that across both your 1U and 2U data center, so that you have just common spares across those types of chassis.

19:48 CB: OK, thanks a lot Paul. So, John, maybe you can elaborate a little bit more on some of these future device types that we might see for E3?

20:00 JG: Sure. The new reboot of E3 introduces a couple of new things. It lines up with the OCP pins, which are . . . I can't point to anything, sorry. It lines up with the OCP edge card connectors -- those are the words I'm looking for; I have to do that a bit live. And it also includes the capability to add, in the future, the extra power that OCP enables in the form factor. This really helps us mix and match both our functionality types, as well as additional functionality types, into this form factor.

20:55 CB: OK, and anything else to add, Bill or Paul, on this?

21:04 BL: Oh yeah. Like I said, this allows us to do front-facing I/O. So, we can take essentially a NIC, plug it into one of these slots and bring I/O out the front. By moving the connector, we aligned ourselves with the OCP NIC 3.0. It turns out that the OCP NIC 3.0 uses the SFF-TA-1002 connector, so we were able to align there. And it's a really good illustration of multiple standards groups working together toward a common cause.

21:40 CB: OK. Well, thanks a lot. John, maybe you can just kind of . . .

[pause]

21:57 JG: OK

21:58 CB: All right. Well, maybe we'll get back to that slide in a little bit. OK, so . . . We seem to have lost Paul. So, we do have a couple of additional reference slides that we can go to, but we definitely want to open it up to questions from the audience. And if people are still getting their questions ready, we have a couple of things that we can elaborate on while we're waiting. So, I guess we'll see if Marty has anything teed up for us. OK . . .

22:43 JG: Looks like no open questions.

22:45 CB: Yeah. I'm sorry, John, why was E3 developed separately from E1 when they are both under this EDSFF umbrella? Do you have any background on that?

23:04 JG: So, the three form factors came from three different spaces. And by three form factors, I mean the E1.S, the E1.L and the E3 family. In one way, all of those form factors are one family, and then the family we've been talking about, where there are multiple elements of it, is E3. Each of them has a slightly different focus on how you build the server behind it and came from a different organizational structure. Some of them are better for some use cases than others. E3 is a general jack-of-all-trades in its functionality.

23:47 CB: OK.

23:49 BL: Yeah. If you think about it, E3 was really aimed at enterprise servers, the 1U and 2U servers. The E1 Long really doesn't work well in a server because it is so long. It's a foot long; it breaks a lot of chassis constraints for a general-purpose server, but it is a very, very dense form factor. So, it works really well in storage applications where you're building a JBOF or something that is just crammed full of storage. The E1.S sort of came at us from an M.2 direction. It's a small . . . I won't say general purpose, but it's a very small form factor, and it's really suited to sub-1U form factors. It's great for the edge and places where you don't have very much room. So, it's really three different use cases that drove the three different form factors.

25:03 CB: OK, thanks for the background on that. Actually, there is . . . I do have a diagram in one of the slides from another presentation . . .

25:18 JG: But let me throw in a little something about why we rebooted in the first place. So, E3 had this goal, which Bill just talked about, but what it didn't do was coexist with front-facing I/O like the OCP NIC 3.0. It didn't allow reuse of E1.S boards within the card structure. So, we wanted to accomplish both of those goals and maintain the original goal, and that's how we ended up with the reboot. One of the big penalties of that is that, for those of us who implemented proof-of-concept devices -- well, throw those bad boys away. We now have new ones.

26:15 CB: OK, good. All right, let's see, I think I've got one for Bill here. Now, we talked a bit about some of the 1U chassis options, but I wanted to pull up something to see if you can elaborate on some 2U configurations. I know there are a lot of different possibilities that you could have with 2U.

26:48 BL: Sure. So, one of the things that Paul pointed out is, if you think about server designs, everything's getting hotter, faster, higher power, everything else. And if you look at that top diagram, it shows 44 E3 devices in a 2U. Mechanically, you can actually fit 46, but in this case, we have 44. That configuration will probably never be productized, for a couple of reasons. One, given that CPU TDPs are going north of 400 watts, and the fact that DDR5 doubles the memory power, I have to treat air as a device now. I have to physically leave room for air to pass through the front end. So, I can't stuff the entire front end of a server completely full of storage anymore, because I'd completely starve everything behind it. So now, given E3 Short and E3 Long, the rule of thumb right now is that an E3 Short is roughly half the capacity of a U.2, and an E3 Long is roughly equal to a U.2.

28:20 BL: That's a rule of thumb right now. So, some people say, "Well, why would you want to build a device that is only half of the capacity of its predecessor?" Well, like John pointed out, small devices are 8 terabytes now. That's hard to wrap your head around. You look at U.2 and you have 16 terabyte devices that are going to 30 terabytes. And the failure domain just becomes so large that if you lose a single device, you're losing this gargantuan chunk of storage. So, in a sense, it's actually better to go to a smaller device and break up that failure domain.

29:09 BL: It also allows you to get higher performance because you've got more devices out there to stripe all your data across, so you can get it in and out of the devices faster. That's one reason for going to a smaller device. If you go to the E3 Long, which has the equivalent capacity of a U.2, you can actually pack almost two times the number of devices into a 2U server. That gives you almost twice the amount of storage, plus twice the amount of performance, because you have more lanes. The problem is 44 devices at four lanes each -- that's a lot of PCIe lanes.
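
To put rough numbers on Bill's point, here is a back-of-the-envelope sketch using the figures quoted in this discussion; the 8 TB and 16 TB capacities are illustrative examples from the conversation, not product commitments.

# Back-of-the-envelope math for the dense 2U example above.
devices = 44              # E3 devices shown across the front of the 2U
lanes_per_device = 4      # each device on a x4 PCIe link
print(devices * lanes_per_device)   # 176 PCIe lanes for front-end storage alone

# The failure-domain and striping argument: smaller devices, but more of them.
e3_short_tb = 8           # "small" E3 Short capacity quoted in the discussion
u2_tb = 16                # U.2 capacity quoted in the discussion
print(devices * e3_short_tb)        # 352 TB aggregate; losing one device costs
                                    # 8 TB instead of 16, and data is striped
                                    # across roughly twice as many drives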

29:55 CB: Well, that is also a good argument for higher capacity drives.

30:00 BL: Yes.

30:01 CB: So, speaking of that . . . I have something here, open to both of you. What are common, or maybe maximum, capacities for an E3.S 1T . . . no, I guess 2T, and then an E3.L and an E3.L 2T? Are there any capacity sizes? And, of course, you can then extrapolate: How would that look in one of these 2U chassis configurations?

30:33 BL: Well, I'm a server guy, so I'm probably not the best guy to make a comment on what the device capacity is going to be. That's more in your camp, being from Kioxia, but . . . for us, it's . . . Well, actually, John, why don't you take that one?

30:57 JG: OK. And I'll admit, I'm a standards guy, not a product guy. What I will say is that the demand for really large devices isn't really there yet. We can build devices that nobody wants yet; it is just too expensive. As time goes on . . . I joke about 8 terabytes being small, and that's a fun joke, but that was unthinkable two years ago. So, in two years, we're going to be at a whole other capacity inflection. But one of the reasons that we have the thins and the thicks and the shorts and the longs is so that you can support types of memory other than NAND. So, if you want to do storage class memory, as is shown at the bottom left right now, you can actually get enough of that stuff on there to be useful, whereas you can get your 8 terabytes and more in your E3 Short 1T today.

32:14 BL: That's a really good point. From Dell's point of view, we view the thin devices, the E3 Short thin and the E3 Long thin, as the primary form factors for NAND-based storage. We really don't see a thick device, or a 2T device, being NAND-based storage. We see that as being more for the I/O and accelerator-type devices, storage class memory and so on. Even though the picture shows SCM as a monolithic block, if the SCM needs a fair amount of power -- 40, 50 watts -- a big chunk of that volume could actually be fins on the device so that we can properly cool it. So, we see the thicker devices for the other device types, not necessarily for NAND-based storage.

33:16 JG: Accelerators will have the same cooling issue . . .

33:21 BL: Yes, exactly.

33:24 CB: I think we're going over a little bit, but I just had maybe two . . .

33:29 JG: How about if I add one thing here: To get all the specs we've been talking about -- SFF-TA-1002, which is the connector; SFF-TA-1008, which is the form factor; and SFF-TA-1009, which is the PCIe mapping for all of the EDSFF family -- go to SNIA SFF. They are available, free of charge, for anybody to download. Other sites are mostly social; the actual good stuff, the specs, is at SNIA SFF.

34:14 BL: And just a public service announcement, SFF-TA-1008 Rev 2.0 is a formally published and ratified specification as of last Friday.

34:28 JG: My prediction did come true. How odd.

34:34 CB: I do see a question that did come in. "Is E3.S a front-loaded card only?"

34:41 BL: No, it can go in the rear.

34:44 CB: OK.

34:46 JG: And if you want to build it coming in from the top and have an unusual U height, you can.

34:54 CB: OK, so given the . . . There's another question that came in, but given the ratification of the specification -- and Kioxia has announced an EDSFF E3 test vehicle, which we did in conjunction with a Dell prototype chassis -- what happens from here? Which kind of leads into: How long will this take to come to market? And then the question from someone in the audience is, "When would we expect broad adoption of E1 devices from major storage vendors?"

35:43 BL: From a server point of view, it generally takes us a year to 18 months to go through a complete design cycle for a server. So, my guess is we'll start seeing stuff from late Q1, Q2 of 2022.

36:01 CB: OK, and there'll be lots of prototypes and samples and stuff going on. We'll definitely hear about development efforts throughout, between now and when they become a hit in the market, but these things do take some time.

36:20 BL: We could use samples today.

36:22 JG: See what we can do about that.

36:28 CB: Let's see, I don't see any other questions coming through from the Whova platform, and I seem to have lost my chatbox for . . .

36:40 JG: So, there was one more: Does this august body have any expectations for the adoption of E1?

36:51 CB: Your brethren, human brethren.

36:54 JG: Indeed. So, yeah, server chassis are hard to come by, but there are some chassis already available for E1, I believe from Supermicro and other sources. And you're certainly seeing that in other sessions here, from Facebook and others. So, I would expect E1 is going to happen a little bit sooner; they've been ready longer. We did just go through a reboot process, which is going to make E3 take a little bit longer to actually productize.

37:37 CB: OK, OK, let's see, let me stop sharing here and I'll see if my chatbox comes back here. OK, well, any other, I guess, closing comments, Bill or John?

37:58 BL: Just, if anybody has . . .

38:00 JG: You can post . . . I was going to say, you can post questions to the Whova app in the questions and answers, and I'm sure the panelists will look at those for the rest of the event and attempt to get answers out in time.

38:16 BL: And if anybody has any other ideas for other device types for E3, feel free to drop me a note. I would be very interested to see what ideas people have.

38:28 CB: Yes, yeah, definitely some interesting possibilities, and E3 definitely opens up that door for even more . . . OK, we are almost 10 minutes over our allotted time, so we'll probably go ahead and have to stop there. Thank you everybody for attending our session today. I know it's a late session on a Wednesday, especially for those on the East Coast and Central. And I guess at this point, we'll sign off. Thank you everybody.

39:01 BL: Thank you.

39:03 JG: Bye.
