Lynn McGuire <lmc@winsim.com>: Jun 01 04:36PM -0500

"C++ Performance: Common Wisdoms and Common "Wisdoms""
   http://ithare.com/c-performance-common-wisdoms-and-common-wisdoms/

"Author: "No Bugs" Hare"
"Job Title: Sarcastic Architect"
"Hobbies: Thinking Aloud, Arguing with Managers, Annoying HRs, Calling a
Spade a Spade, Keeping Tongue in Cheek"

I have got to admit, I am very guilty of premature optimization.

Lynn
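A minimal sketch of the measure-before-optimizing habit mentioned above,
assuming nothing beyond the standard library; work() is a placeholder for
whatever code is suspected of being slow, not anything from the linked
article:

// Time a candidate function before deciding it needs optimizing.
// work() is a stand-in; replace it with the code under suspicion.
#include <chrono>
#include <cstdio>
#include <numeric>
#include <vector>

static long long work()
{
    std::vector<int> v(1'000'000);
    std::iota(v.begin(), v.end(), 0);
    return std::accumulate(v.begin(), v.end(), 0LL);
}

int main()
{
    const auto t0 = std::chrono::steady_clock::now();
    const long long result = work();
    const auto t1 = std::chrono::steady_clock::now();
    const auto us =
        std::chrono::duration_cast<std::chrono::microseconds>(t1 - t0).count();
    std::printf("result=%lld, took %lld us\n", result,
                static_cast<long long>(us));
}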
Jerry Stuckle <jstucklex@attglobal.net>: May 31 08:24PM -0400

On 5/31/2016 4:37 PM, Ian Collins wrote:

> well as fiber. Intel 10/40 GbE chips are becoming mainstream on Xeon
> server boards. I use 10 GbE copper in my home network which will
> happily run at full speed.

And you didn't answer the question, either. How like a troll.

>> data concurrently at full speed.

> They can, common 8 port SAS/SATA controllers use 8xPCIe lanes which
> provide ample bandwidth for 8 or 16 SATA drives.

Try again. Only one controller can access the data bus at a time. 16
SATAs cannot transfer data concurrently at full speed.

--
==================
Remove the "x" from my email address
Jerry Stuckle
jstucklex@attglobal.net
==================
Ian Collins <ian-news@hotmail.com>: Jun 01 06:04PM +1200

On 06/ 1/16 12:24 PM, Jerry Stuckle wrote:

>> server boards. I use 10 GbE copper in my home network which will
>> happily run at full speed.

> And you didn't answer the question, either.

If you knew anything about the data centre space you would know about
100GbE networking.

>> They can, common 8 port SAS/SATA controllers use 8xPCIe lanes which
>> provide ample bandwidth for 8 or 16 SATA drives.

> Try again. Only one controller can access the data bus at a time.

And controllers have 8, 16 and even 24 channels.

> 16 SATAs
> cannot transfer data concurrently at full speed.

Well mine do. There's an esoteric concept in the storage world you
probably haven't encountered in your 90s world called "RAID".

--
Ian
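For rough scale on the bandwidth numbers in this exchange, a
back-of-the-envelope sketch; the PCIe, SATA and disk figures are assumed
nominal values (PCIe 3.0 x8, SATA III links, ~175 MB/s sustained per
spinning disk), not measurements from anyone's hardware:

// Back-of-the-envelope comparison: host link vs. drive-side bandwidth
// for an 8-lane PCIe 3.0 SAS/SATA HBA with 16 SATA drives behind it.
// All figures are assumed nominal values.
#include <cstdio>

int main()
{
    const double pcie3_lane_mbs = 985.0;   // usable per PCIe 3.0 lane, approx.
    const int    hba_lanes      = 8;
    const double sata3_link_mbs = 600.0;   // 6 Gb/s SATA III link ceiling
    const double hdd_mbs        = 175.0;   // assumed sustained HDD rate
    const int    drives         = 16;

    std::printf("PCIe 3.0 x8 host link : %7.0f MB/s\n",
                pcie3_lane_mbs * hba_lanes);
    std::printf("16 SATA III links     : %7.0f MB/s (interface ceiling)\n",
                sata3_link_mbs * drives);
    std::printf("16 spinning disks     : %7.0f MB/s (assumed sustained)\n",
                hdd_mbs * drives);
}

On those assumed numbers the x8 host link comfortably covers what 16
spinning disks can sustain, though not 16 SATA links all running flat out
at their interface ceiling.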
Juha Nieminen <nospam@thanks.invalid>: Jun 01 06:12AM

>> project, be my guest. But don't be telling people that it's a problem
>> for *all* C++ programmers, because it isn't. That's just a big fat lie.

> Just because you don't doesn't mean it's not important.

Then switch to another language, if it's so important to you.
How hard is that to understand?

My point is: Only a small fraction of C++ programmers need to work on
codebases that large. It's not a reason to tell them all that C++ is
bad because some megaproject X takes a long time to compile. That's a
completely stupid complaint. It doesn't affect the majority of people.

--- news://freenews.netfront.net/ - complaints: news@netfront.net ---
scott@slp53.sl.home (Scott Lurndal): Jun 01 12:55PM

>> provide ample bandwidth for 8 or 16 SATA drives.

> Try again. Only one controller can access the data bus at a time. 16
> SATAs cannot transfer data concurrently at full speed.

The front-side bus has been obsolete for over a decade.

Modern hardware has no problem transferring from 16 SATA _controllers_
simultaneously.
legalize+jeeves@mail.xmission.com (Richard): Jun 01 04:34PM

[Please do not mail me a copy of your followup]

Even though I have Jerry in my KILL file, I am finding Ian's responses
quite enjoyable humor :).

--
"The Direct3D Graphics Pipeline" free book <http://tinyurl.com/d3d-pipeline>
The Computer Graphics Museum <http://computergraphicsmuseum.org>
The Terminals Wiki <http://terminals.classiccmp.org>
Legalize Adulthood! (my blog) <http://legalizeadulthood.wordpress.com>
Jerry Stuckle <jstucklex@attglobal.net>: Jun 01 12:58PM -0400

On 6/1/2016 2:12 AM, Juha Nieminen wrote:

>> Just because you don't doesn't mean it's not important.

> Then switch to another language, if it's so important to you.
> How hard is that to understand?

Why should I switch when it's not important to you?

> bad because some megaproject X takes a long time to compile. That's a
> completely stupid complaint. It doesn't affect the majority of people.

> --- news://freenews.netfront.net/ - complaints: news@netfront.net ---

Just because you don't have a problem with it doesn't mean other
programmers don't. And yes, there are many programmers who work on
larger projects and it affects them. It doesn't take long for that to
become a problem.

It's not my problem that the biggest program you ever wrote was
"Hello World".

--
==================
Remove the "x" from my email address
Jerry Stuckle
jstucklex@attglobal.net
==================
Jerry Stuckle <jstucklex@attglobal.net>: Jun 01 01:11PM -0400

On 6/1/2016 2:04 AM, Ian Collins wrote:

>> And you didn't answer the question, either.

> If you knew anything about the data centre space you would know about
> 100GbE networking.

I know about the data center space. But that is limited to the data
center. How many places in the world have 100Gb/s networking external
to a data center?

>>> provide ample bandwidth for 8 or 16 SATA drives.

>> Try again. Only one controller can access the data bus at a time.

> And controllers have 8, 16 and even 24 channels.

Of which only one can access the data bus at a time.

>> 16 SATAs
>> cannot transfer data concurrently at full speed.

> Well mine do.

You only *think* yours do.

> There's an esoteric concept in the storage world you probably haven't
> encountered in your 90s world called "RAID".

That's right. And here are some esoteric concepts in the storage world
you probably haven't encountered in your 70's world:

Seek command
Get response
Read/write block()
Get response

Even on an SSD, the above takes time - several cycles to interpret and
process the commands and send the response. With a hard disk, even
RAID, it takes even longer.

Now the data may already be in the RAID's cache (in which case it will
operate at speeds near to - but not quite as fast as - an SSD), but
even then the cache size is limited. And even an 8 disk RAID can't keep
up; eventually it will have to go to disk, and even the fastest disks
max out at around 175MB/s.

Plus, if the controller uses DMA to transfer to memory, it has to
interleave memory access with the processor. Fine if the processor
isn't doing anything, but not so good if it is. And if it doesn't use
DMA, the processor has to perform the transfer.

You only THINK you are getting full speed. Even manufacturers admit the
data transfer rates they quote are theoretical maximums.

--
==================
Remove the "x" from my email address
Jerry Stuckle
jstucklex@attglobal.net
==================
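To put rough numbers on the command/response point, a toy model:
effective throughput of one spinning disk as a function of how many
commands are kept in flight, capped by the media rate. The latency,
transfer-size and media-rate figures are illustrative assumptions only:

// Toy model: per-command round trips limit a single disk at low queue
// depth; queued commands hide that latency until the media rate caps it.
// All figures are illustrative assumptions.
#include <algorithm>
#include <cstdio>

int main()
{
    const double block_mb  = 0.128;   // 128 KB per command, assumed
    const double rtt_s     = 0.005;   // ~5 ms seek + rotation, assumed
    const double media_mbs = 175.0;   // assumed sustained media rate

    const int depths[] = {1, 2, 4, 8, 32};
    for (int qd : depths)
    {
        const double latency_bound = qd * block_mb / rtt_s;
        std::printf("queue depth %2d: ~%6.1f MB/s\n",
                    qd, std::min(latency_bound, media_mbs));
    }
}

The point of the toy model is only that per-command latency and
aggregate throughput are separate quantities; which one dominates
depends on queue depth and access pattern.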
Jerry Stuckle <jstucklex@attglobal.net>: Jun 01 01:13PM -0400

On 6/1/2016 8:55 AM, Scott Lurndal wrote:

> The front-side bus has been obsolete for over a decade.

> Modern hardware has no problem transferring from 16 SATA _controllers_
> simultaneously.

Try again. Only one can access the bus at a time. You do not have 16
separate address buses to the same memory. And even the memory chips
only have one address and one data bus.

--
==================
Remove the "x" from my email address
Jerry Stuckle
jstucklex@attglobal.net
==================
scott@slp53.sl.home (Scott Lurndal): Jun 01 05:37PM

> Try again. Only one can access the bus at a time. You do not have 16
> separate address buses to the same memory. And even the memory chips
> only have one address and one data bus.

What bus? Everything is either connected point to point through a
non-blocking crossbar, or in the case of Intel processors, all elements
sit on a pair of rings. The bandwidth of the ring is sufficient to
support all data sources and sinks being active at the same time. That
means all devices can transfer data simultaneously to memory (and the
cache hierarchy will snoop the accesses and invalidate any cached data
along the way).

There are multiple memory controllers and multiple DIMMs on the memory
subsystem side, and sufficient bandwidth to support the crossbar and/or
ring structures in the unCore/RoC.

Your key word for the day is "interleave".
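A minimal sketch of what "interleave" means here, assuming a
hypothetical mapping that spreads consecutive 64-byte cache lines
across four memory channels; real controllers hash more address bits,
and the exact mapping is implementation-specific:

// Toy address-to-channel mapping: consecutive cache lines land on
// different channels, so independent streams can proceed in parallel.
// Channel count and line size are assumptions for illustration.
#include <cstdint>
#include <cstdio>

constexpr unsigned kChannels  = 4;   // assumed channel count
constexpr unsigned kLineShift = 6;   // 64-byte cache lines

unsigned channel_of(std::uint64_t addr)
{
    return (addr >> kLineShift) % kChannels;
}

int main()
{
    for (std::uint64_t addr = 0; addr < 8 * 64; addr += 64)
        std::printf("addr 0x%04llx -> channel %u\n",
                    static_cast<unsigned long long>(addr), channel_of(addr));
}

Each channel has its own controller and DIMMs, so lines that map to
different channels can be in flight at the same time; the sketch shows
only the mapping, not the contention behaviour.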
scott@slp53.sl.home (Scott Lurndal): Jun 01 05:58PM

> Get response
> Read/write block()
> Get response

Actually, the 70's were the last point in time where anyone, ever, did
separate seeks and read/writes. They've been combined into a single
operation for 30+ years on every hardware type that matters (IDE, SATA,
SCSI, SAS, FC, NVMe). Even in the 70's, they were combined for everyone
except IBM's CKD (Burroughs stopped using discrete seeks in the late
60's).

> Even on an SSD, the above takes time - several cycles to interpret and
> process the commands and send the response.

Even on an SSD attached to a bog-standard SATA controller, that's not
true. One submits a FIS (Frame Information Structure) to the controller
that identifies the direction of transfer and the desired starting
sector on the device. The controller will DMA data to/from memory
independently of the CPU, even handling non-physically-contiguous
memory regions. The driver can queue many FISes to the controller
hardware. A simple port multiplier with a handful of modern SSDs can
saturate 6Gbps third-generation SATA lanes.

Modern systems use NVMe controllers instead of SATA, with a new driver
interface that provides even lower overhead (64k queues, each 64k
entries deep) and much better throughput with higher bandwidth.

https://en.wikipedia.org/wiki/NVM_Express

> interleave memory access with the processor. Fine if the processor
> isn't doing anything, but not so good if it is. And if it doesn't use
> DMA, the processor has to perform the transfer.

You really don't understand processor design. Firstly, the caches (and
most systems see 90%+ hit rates) somewhat isolate the processors from
the memory subsystem. Secondly, the bandwidth to the memory subsystem
is designed from the start to be sufficient to support simultaneous
access from the cache subsystem (refills and evictions of dirty lines)
and the I/O subsystem hardware.

When we were building supercomputers, where memory bandwidth is king,
the requirement was something like 1 byte per flop, and we were able
to do that using QDR InfiniBand as the interconnect with modern memory
controllers.

We regularly measure the actual data rates to ensure that we meet the
line rate under load. Our 40Gbps interfaces get 36+Gbps for TCP packets
(the 64b/66b encoding and TCP headers eat some of the bandwidth),
measured and sustained, with full processor utilization (there are 48
cores).
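A rough sanity check of the 36+Gbps figure, assuming the quoted 40Gbps
is the raw signalling rate before 64b/66b coding, a 1500-byte MTU, IPv4,
and no TCP options; real links differ in the details:

// Rough goodput estimate for a 40Gbps link: line coding plus per-frame
// Ethernet/IP/TCP overhead.  All inputs are assumptions.
#include <cstdio>

int main()
{
    const double line_rate_gbps = 40.0;
    const double coding         = 64.0 / 66.0;             // 64b/66b
    const double frame_on_wire  = 1500 + 14 + 4 + 8 + 12;   // MTU + eth hdr + FCS + preamble + IFG
    const double tcp_payload    = 1500 - 20 - 20;            // minus IPv4 and TCP headers

    const double goodput = line_rate_gbps * coding * (tcp_payload / frame_on_wire);
    std::printf("estimated TCP goodput: %.1f Gbps\n", goodput);   // ~36.8 Gbps
}

That works out to roughly 36.8 Gbps under those assumptions, in the same
ballpark as the quoted measurement.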
Jerry Stuckle <jstucklex@attglobal.net>: Jun 01 04:08PM -0400

On 6/1/2016 1:58 PM, Scott Lurndal wrote:

> matters (IDE, SATA, SCSI, SAS, FC, NVMe). Even in the 70's, they
> were combined for everyone except IBM's CKD (Burroughs stopped using
> discrete seeks in the late 60's).

Sometimes yes, sometimes no. Both exist, and are used.

> driver can queue many FISes to the controller hardware. A simple port
> multiplier with a handful of modern SSDs can saturate 6Gbps
> third-generation SATA lanes.

That is correct. But it still takes several cycles to do so. The
response is not immediate. Just like most processor instructions take
multiple cycles.

> interface that provides even lower overhead (64k queues, each 64k
> entries deep) and much better throughput with higher bandwidth.

> https://en.wikipedia.org/wiki/NVM_Express

Which does not change the facts.

> is designed from the start to be sufficient to support simultaneous
> access from the cache subsystem (refills and evictions of dirty lines)
> and the I/O subsystem hardware.

Oh, yes, I understand it all right. Your 90% hit rate is for
instruction retrieval, not data access. And even then, the caches must
be loaded, which requires memory access. And no memory system supports
concurrent access. The maximum speed can occur in bursts - but even the
disk manufacturers admit it's only a theoretical maximum, not
real-world.

> the requirement was something like 1 byte per flop, and we were able
> to do that using QDR InfiniBand as the interconnect with modern memory
> controllers.

1 byte per floating point operation per second? That doesn't even make
sense.

> rate under load. Our 40Gbps interfaces get 36+Gbps for TCP packets
> (the 64b/66b encoding and TCP headers eat some of the bandwidth),
> measured and sustained, with full processor utilization (there are 48
> cores).

Yes, the packets may be 36Gbs, but the data rate is not. And it is not
the 40Gbs you previously claimed. And for how long did you measure it?
A few milliseconds?

And it still doesn't measure up to mainframe speeds.

--
==================
Remove the "x" from my email address
Jerry Stuckle
jstucklex@attglobal.net
==================
Jerry Stuckle <jstucklex@attglobal.net>: Jun 01 04:22PM -0400

On 6/1/2016 1:37 PM, Scott Lurndal wrote:

> memory controllers and multiple DIMMs on the memory subsystem side, and
> sufficient bandwidth to support the crossbar and/or ring structures in
> the unCore/RoC. Your key word for the day is "interleave".

Here's something simple for you to understand. It's not a complete
description, but I'm not going to write a novel on how memory works.

Memory chips have two sets of lines. They have address lines and they
have data lines. The address lines are used to access a particular
memory location, and the data lines are used to read or write to that
location.

Now, the memory chips in the computer are connected together. Some
address lines are directly in parallel, while others go through
additional circuitry to select the desired chip(s). But all chips share
at least some address lines. The same with data lines - chips can share
the same data lines, but only one chip is active at one time for any
data line. These are called the address and data buses, respectively.

Only one device (including the CPUs) can place an address on the bus at
any one time. Devices can interleave - but that means that all other
devices must wait while that device is accessing the memory.

It doesn't make any difference how many DIMMs and how many memory
controllers there are - the chip's design restricts access to one unit
at a time.

So no, you cannot run all of your I/O concurrently at full speed - no
matter what you claim. But then it's just what I would expect from you
based on your earlier claims.

--
==================
Remove the "x" from my email address
Jerry Stuckle
jstucklex@attglobal.net
==================
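For scale on the memory side of this argument, the nominal peak numbers
work out as below; DDR4-2400, a 64-bit channel and four channels per
socket are assumptions, and these are burst peaks, not sustained rates:

// Nominal peak DRAM bandwidth: transfer rate (MT/s) x bus width (bytes)
// per channel, summed over channels.  All inputs are assumptions.
#include <cstdio>

int main()
{
    const double mts_per_channel = 2400.0;  // DDR4-2400, assumed
    const double bytes_per_xfer  = 8.0;     // 64-bit channel
    const int    channels        = 4;       // assumed channels per socket

    const double per_channel_gbs = mts_per_channel * bytes_per_xfer / 1000.0;  // ~19.2 GB/s
    std::printf("per channel: %.1f GB/s, %d channels: %.1f GB/s peak\n",
                per_channel_gbs, channels, per_channel_gbs * channels);
}

Whether that nominal headroom is actually usable by concurrent I/O is
exactly what is being disputed above; the sketch only shows the
arithmetic both sides are reasoning from.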