<div dir="ltr"><div><div><div><div><div>-WD<br><br>The GPUs are sent data in chunks that they then process and return. The time it takes a GPU to process a chunk can vary, so I assume the bottlenecks we were seeing occurred when several of the GPU cores finished at about the same time and requested new chunks, and the chunks they needed weren't already in RAM, so the drive array would take a heavy hit. <br><br></div><div>Beyond that, I can't really give you a numerical value for the amount of data they are dumping onto the PCIe bus. <br><br></div>
<div><br></div>-David<br><br></div><div>Yeah, I'm not sure an FPGA exists that's large enough for this; it would be interesting, though. <br></div><br></div>While the process isn't entirely sequential, previously processed data is reused in the processing of other data, so that has kept us away from trying a cluster approach. <br>
<br></div>Depending on the problem, anywhere from minutes per iteration to weeks per iteration. The weeks-long problems are sitting at about 3TB, I believe. We've only run benchmark problems on the SSDs up till now, so we haven't had the experience of seeing how they react once they start really getting full. <br>
<br></div>Sadly, 2TB of RAM would not be enough. I looked into this HP box (<a href="http://www8.hp.com/us/en/products/proliant-servers/product-detail.html?oid=4231377#!tab=features">http://www8.hp.com/us/en/products/proliant-servers/product-detail.html?oid=4231377#!tab=features</a>) that would take 4TB, but the costs were insane and it can't support enough GPUs to actually do anything with the RAM...<br>
<br><br></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Thu, Dec 12, 2013 at 4:51 PM, David <span dir="ltr"><<a href="mailto:ainut@knology.net" target="_blank">ainut@knology.net</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
Stephan,<br>
<br>
The system architecture will determine how quickly your huge tasks
get finished. (See SGI versus PC in previous email.)<br>
<br>
NUMA refers to the design of the computer and, for our interests,
determines how fast data moves around the buses.<br>
It is one of the reasons that, back in the 90's, SGI would whup
Sun's butt in execution times for data-intensive tasks, even though
the Suns were more expensive (Sun just had more advertising
money). Anyway, a lot of that technology finally made its way from
the workstation/minicomputer world into the PC world. <br>
<br>
GPUs are fantastic for this class of tasks. FPGAs would be much
faster still, but I doubt you could even get one big enough.<br>
<br>
While we're on the subject of optimizations :)<br>
Is the algorithm linear or can it be broken up such that it could
run several times faster if run on a cluster? It might be a
candidate, depending upon the algorithm. A PS3 can sometimes rival
GPU's as well but it is severely limited in onboard memory. A
cluster is ideal if, for example, your task is image analysis and
processor 1 can work on the upper left corner of the image, while
processor 2, the upper right, and so on, with each computer in the
cluster having its own GPU. But if, during runtime, each successive
calculation depends entirely on the one preceding it, then a cluster
would just slow you down.<br>
<br>
Yeah, you don't want the GPUs doing mundane work like file I/O,
just data crunching, because that's what they're best at. That,
though, is entirely compiler dependent, unless your language allows
you to specify what runs where.<br>
<br>
May I ask what your runtimes are now?<br>
<br>
If you installed 2 terabytes of memory (mortgaging the state of
Alabama to do so), you could get unbelievable runtimes. :)<div><div class="h5"><br>
<br>
David M.<br>
<br>
<br>
<br>
<div>Stephan Henning wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div>I'm honestly not sure, David; I'm a bit confused as to
whether NUMA is supposed to be something implemented at the die
level or at the system level. If at the die level, this box is
running 2x Xeon E5-2650. I've never gotten involved in the
architecture side of things, so this is a bit foreign to me. <br>
<br>
The bulk of the computation for a run is done on the GPUs; if
it all ran on the CPU, a single run would take months. The GPUs
have cut the runtime way down, but only part of the process is
GPU accelerated, and this file creation phase is one of the
parts that is not and is still being processed by the CPU. <span style="text-indent:0px;letter-spacing:normal;font-variant:normal;text-align:left;font-style:normal;display:inline!important;font-weight:normal;float:none;line-height:20px;color:rgb(51,51,51);text-transform:none;font-size:14px;white-space:normal;font-family:Arial,sans-serif;word-spacing:0px"><br>
</span></div>
<span style="text-indent:0px;letter-spacing:normal;font-variant:normal;text-align:left;font-style:normal;display:inline!important;font-weight:normal;float:none;line-height:20px;color:rgb(51,51,51);text-transform:none;font-size:14px;white-space:normal;font-family:Arial,sans-serif;word-spacing:0px"></span></div>
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">On Thu, Dec 12, 2013 at 3:40 PM, David
<span dir="ltr"><<a href="mailto:ainut@knology.net" target="_blank">ainut@knology.net</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000"> Sometimes, the use of
DMA with the newest SATA III controllers actually slows it
down. Only a live test will show which is faster.<br>
Good point about the random access versus linear write,
though. My suspicion from his overview is that it is
linear.<br>
<br>
Linux is pretty good at managing disk optimization but if
tweaking is necessary, it can get very problematic because
as you shift the dynamics, the OS file system handler
shifts its management algorithm. I believe that came
from the Cray world.<br>
<br>
Stephan, does your computer use the NUMA architecture?
There is a newer, slightly faster design but I can't
remember what it is. The reason I bring that up is that
in a former life I had to deal with data the size you
mentioned. As a test, I ran some benchmarks several times
with the best PC available at the time against a smallish
SGI pizza box. The program used was one I wrote and had
in production for quite a while. The PC would massage the
data and finish in about 12 1/2 hours. The SGI box did it
in 2 hours 45 minutes. I ran that test several times using data
that changed somewhat each month and timing results were
consistent. Just something to think about if run-times
are killing you at the moment.
<div>
<div><br>
<br>
<br>
<div>Arthur wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">Here's something else to think about. Is the program writing out the data in sequential chunks, or is it writing to random parts of the file?
<div><br></div>
<div>With buffered writes, the best speedup you can get with a RAID 0 array is when one disk is writing something while the other disk is seeking to the place where the next thing will be written. If you're dealing with a bunch of random writes, then ponying up for a few SSDs or refactoring the code might be worth it.</div>
<div><br></div>
<div>Assuming that your RAID controller isn't the bottleneck. Some motherboard-based RAID controllers use the CPU to do the work, and can cause everything to slow down. (Side note: does anyone know if DMA works with those kinds of controllers?)</div>
</div>
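The sequential-versus-random question above can be probed roughly from the shell. This is only a hedged sketch, not a real benchmark (a dedicated tool like fio would be the proper choice); `ddtest.bin` is a placeholder path, and `conv=fdatasync` makes dd flush to disk so the reported rate includes drive time rather than just the page cache:

```shell
# Rough sequential-write probe; "ddtest.bin" is a placeholder path.
# conv=fdatasync forces a flush, so the reported rate includes disk time.
dd if=/dev/zero of=ddtest.bin bs=1M count=64 conv=fdatasync
rm -f ddtest.bin   # clean up the probe file
```

Random-write behavior is a different beast entirely, which is why the SSD suggestion matters.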
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">On Thu, Dec 12, 2013 at
11:26 AM, Stephan Henning <span dir="ltr"><<a href="mailto:shenning@gmail.com" target="_blank">shenning@gmail.com</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div dir="ltr">
<div>
<div>
<div>The program will write out a file of variable size; it's based on the problem being run. Currently, it writes out approximately 1.5TB for the benchmark problem, most of that contained in a single file, much too large for a ramdisk. Unfortunately, the problems have grown so large that they can't be run in RAM any more. This is a GPU-accelerated program, so this file gets modified very heavily during the course of a run. <br>
<br>
</div>
Current testing is being done on a RAID 0 of 5x Crucial 960GB SSDs. This has proven to be significantly faster than the old array, but I am trying to determine exactly how hard the disks are being hammered so I can try to optimize the hardware configuration. <br>
<br>
</div>
The program is compiled from source, but I'm not involved in that process; I'd much rather try to piggyback something and monitor the process than go in and have something added to the source. <br>
<br>
</div>
I'll add parted and gparted to my list of things to read up on, thanks. <br>
</div>
<div>
<div>
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">On Thu, Dec 12,
2013 at 12:29 AM, David <span dir="ltr"><<a href="mailto:ainut@knology.net" target="_blank">ainut@knology.net</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000"> Excellent
approach.
<div>
<div><br>
<br>
<br>
<div>Arthur wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">How big are
the files that you're
dealing with?
<div>If they're small you
can just make a ramdisk
and try running
everything in there.</div>
<div>It's not a final
solution, but between
that and strace you
should be able to see if
that's really the issue
or not.</div>
<div><br>
</div>
<div>Are you compiling
from source? If you
are, then there are a
bunch of debugging tools
you can use as well as
doing things like timing
individual commands, and
seeing how many times
each line of code is
run.</div>
</div>
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">On
Wed, Dec 11, 2013 at
10:48 PM, Stephan
Henning <span dir="ltr"><<a href="mailto:shenning@gmail.com" target="_blank">shenning@gmail.com</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div dir="ltr">This is a RedHat 6 Enterprise install.
<div><br></div>
<div>I don't think htop has the data I need, but I'll check. I'm not familiar with ntop, and I hadn't considered using strace for this; I'll check that as well.</div>
<div><br></div>
<div>The goal is to record read/write rates and block sizes. I'm pretty sure I am bottlenecking against the drive array; I'm hoping I can get some definitive answers from this. </div>
</div>
<div>
<div>
<div class="gmail_extra">
<br>
<br>
<div class="gmail_quote">On
Wed, Dec 11,
2013 at 6:01
PM, David <span dir="ltr"><<a href="mailto:ainut@knology.net" target="_blank">ainut@knology.net</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
ntop might do the trick, but it's not available in Fedora.
<div>
<div><br>
<br>
<br>
<div>David
wrote:<br>
</div>
<blockquote type="cite">
Can 'htop' show open files?<br>
<br>
For intensive live net data, look at Wireshark for Linux.<br>
<br>
<br>
<div>David wrote:<br>
</div>
<blockquote type="cite">
If that's what you're looking for, there are several (free) programs you could run from the command line in a separate window/screen while your program is running that give you all you're asking about. Sort of an equivalent to Winblows' "System Explorer." What flavor of Linux are you using? <br>
<br>
David M.<br>
<br>
<br>
<div>Devin
Boyer wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">Try something like "strace -T myapp" or "strace -T -c myapp"; they'll show the system calls being made and the amount of time spent in each. It's slightly different information than iostat, but it may be useful in figuring out what and where your program is performing I/O.</div>
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">On
Wed, Dec 11,
2013 at 3:37
PM, Stephan
Henning <span dir="ltr"><<a href="mailto:shenning@gmail.com" target="_blank">shenning@gmail.com</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div dir="ltr">
<div>
<div>
<div>No, iostat will normally just dump to the terminal window, but I'd like to pipe its output to a file so I can parse it later. <br>
<br>
</div>
My end goal here is to be able to generate a log of iostat output while I run this program. I'm trying to determine exactly how hard this program is hitting my hard drive, and at what points during its run it accesses the drive most frequently. <br>
<br>
</div>
I've done something similar in bash before, but it is rather clunky. <br>
<br>
</div>
I'll take a look at exec and see if I can use it. <br>
</div>
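Parsing a saved iostat log afterward can be a one-liner. A hedged sketch against a fabricated two-line sample log: the column layout varies between sysstat versions, so treating field 5 as writes/sec is an assumption you'd verify against your own header line.

```shell
# Fabricated sample of an `iostat -dx`-style log, just for illustration.
cat > sample_iostat.log <<'EOF'
Device: rrqm/s wrqm/s r/s w/s
sda 0.00 1.20 5.00 80.00
sdb 0.00 0.80 3.00 60.00
EOF
# Print device name and writes/sec (field 5 -- an assumed column layout).
awk '$1 ~ /^sd/ {print $1, $5}' sample_iostat.log
# prints: sda 80.00
#         sdb 60.00
```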
<div>
<div>
<div class="gmail_extra">
<br>
<br>
<div class="gmail_quote">On
Wed, Dec 11,
2013 at 4:46
PM, David <span dir="ltr"><<a href="mailto:ainut@knology.net" target="_blank">ainut@knology.net</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
Do you need to do anything with the results, or just need them displayed?<br>
If you need to manipulate the results, consider using Perl, <br>
or, if C or C++,<br>
in your 'exec' call, pipe the output to a file, then just read that file into your program.<br>
Ain't UNIX great?
<div><br>
<br>
David M.<br>
<br>
<br>
<div>Stephan
Henning wrote:<br>
</div>
</div>
<blockquote type="cite">
<div>
<div dir="ltr">
<div>I'd like to take some metrics with iostat while I have a specific program running. Is there a way to wrap iostat around another program (it is called from the command line) so that iostat ends when the program finishes running? <br>
<br>
</div>
I know I can do it with a bash script, but I'm hoping for a more elegant solution. <br>
</div>
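The wrapping asked about here doesn't need a full bash script; backgrounding the monitor and killing it when the program exits is enough. A minimal sketch of the pattern, with stand-ins: `date` plays the role of iostat and `sleep 2` the role of the real program (with sysstat installed, the logger line would be something like `iostat -dx 5 > iostat.log &`).

```shell
# Start the "monitor" in the background, remember its PID,
# run the workload, then stop the monitor when the workload exits.
( while true; do date +%s; sleep 1; done ) > monitor.log &
logger_pid=$!
sleep 2            # stand-in for the actual program invocation
kill "$logger_pid"
wc -l monitor.log  # how many samples were captured during the run
```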
<br>
<fieldset></fieldset>
<br>
</div>
<div>
<pre>_______________________________________________
General mailing list
<a href="mailto:General@lists.makerslocal.org" target="_blank">General@lists.makerslocal.org</a>
<a href="http://lists.makerslocal.org/mailman/listinfo/general" target="_blank">http://lists.makerslocal.org/mailman/listinfo/general</a></pre>
</div>
</blockquote>
<br>
</div>
<br>
</blockquote>
</div>
<br>
</div>
</div>
</div>
<br>
</blockquote>
</div>
<br>
</div>
<br>
<fieldset></fieldset>
<br>
</blockquote>
<br>
<br>
<fieldset></fieldset>
<br>
</blockquote>
<br>
<br>
<fieldset></fieldset>
<br>
</blockquote>
<br>
</div>
</div>
</div>
<br>
</blockquote>
</div>
<br>
</div>
</div>
</div>
<br>
</blockquote>
</div>
<br>
<br clear="all">
<div><br>
</div>
-- <br>
Sincerely,<br>
Arthur Moore<br>
<a href="tel:%28256%29%20277-1001" value="+12562771001" target="_blank">(256)
277-1001</a><br>
</div>
<br>
<fieldset></fieldset>
<br>
</blockquote>
<br>
</div>
</div>
</div>
<br>
</blockquote>
</div>
<br>
</div>
</div>
</div>
<br>
</blockquote>
</div>
<br>
<br clear="all">
<div><br>
</div>
-- <br>
Sincerely,<br>
Arthur Moore<br>
<a href="tel:%28256%29%20277-1001" value="+12562771001" target="_blank">(256)
277-1001</a><br>
</div>
<br>
<fieldset></fieldset>
<br>
</blockquote>
<br>
</div>
</div>
</div>
<br>
</blockquote>
</div>
<br>
</div>
<br>
<fieldset></fieldset>
<br>
</blockquote>
<br>
</div></div></div>
</blockquote></div><br></div>