[ML-General] Cluster Computing

Stephan Henning shenning at gmail.com
Tue Feb 3 16:55:52 CST 2015


@ Hunter,

Yes, there are. I picked up an Atom C2750 board and a Celeron Q2900 board,
and I've got a board with an i7-4790S that I'm going to put them up against.
I'm going to try these out first and might give the NUCs a shot in the next
round, depending on how these do.

@ Webdawg,

I haven't picked a platform yet. Leaning toward Rocks or Kerrighed at the
moment.

I'm not going to bother with ECC at the moment when benchmarking. I'm
trying to reduce the variables as much as possible, and for the most part
these will only see relatively short-duration problems. I will re-evaluate
its usefulness later, as I'm not entirely convinced ECC buys much when
working with large numbers of small datasets. Do you have some thoughts
here to share? If it is worth doing, I'm certainly willing to use it.

@ david, Shae

Yup, I remember when that was the hot thing to do: install Linux on your
PS3. From what I remember, there are a lot of 'ifs' involved in doing so,
though. IF you have an older hardware revision and IF it hasn't been
updated to a later software build, then you might be able to get Linux on
it. As Shae pointed out, I wouldn't really look forward to trying to
recompile our program for the Cell engine.


On Tue, Feb 3, 2015 at 4:28 PM, Shae <shae.erisson at gmail.com> wrote:

> The PS3 has a number of downsides, which can mostly be summarized as a
> lack of support since Sony removed the OtherOS option.
>
> Secondly, it's difficult to write a compiler smart enough to take good
> advantage of the SPUs on the Cell Broadband Engine.
>
> If you decide you want to learn about doing highly parallel computing on
> the Cell, I can bring my BladeCenter to the shop and fire up one (or many)
> of my QS20 blades for experiments.
> But honestly, the Cell is lots of trouble without so much benefit.
>
>
> On Tue Feb 03 2015 at 4:14:46 PM david <ainut at knology.net> wrote:
>
>>  IF, and it's a big IF, your problem lends itself to a pure distributed
>> processing paradigm (think Cray...), a very low cost setup with phenomenal
>> compute speeds is the Sony PlayStation 3, believe it or not.  You can find
>> them really cheap nowadays.  Network a few of them together, install
>> Linux/UNIX on them (might be available out there), and set up the Cray-type
>> compiler (from SGI) and you'll have a honking system.  In the right problem
>> domain, 5 of those would outperform hundreds of the pico-computers.
>>
>>
>>
>>
>>
>> On 02/03/2015 03:57 PM, Stephan Henning wrote:
>>
>> There was a group that did it a while back. I want to say they did it
>> with Atom processors. They ended up with 400+ nodes in a 10U rack, I think.
>>
>> On Tue, Feb 3, 2015 at 3:55 PM, Erik Arendall <earendall at gmail.com>
>> wrote:
>>
>>> This would be a cool project: develop a module board that contains the
>>> CPU/GPU of choice and the required RAM. Then the modules could plug into
>>> a supervisory control node.
>>>
>>> On Tue, Feb 3, 2015 at 3:50 PM, Stephan Henning <shenning at gmail.com>
>>> wrote:
>>>
>>>> Hey Hunter,
>>>>
>>>>  Well, with the Edison, it wouldn't be 27 devices, it would be closer
>>>> to 400 :)
>>>>
>>>>  I *think* I can fit 27 mini-ITX motherboards in a 4U chassis (maybe
>>>> only 21-24, depending on heatsink height). For the RasPis or the Edisons
>>>> to be viable, they would need to beat that baseline on a FLOPS-per-watt
>>>> vs. cost comparison. Even in that case, the low RAM amount limits their
>>>> usefulness.
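>>>>
>>>> To make that comparison concrete, here is the kind of back-of-envelope
>>>> math I have in mind -- a minimal Python sketch where every per-node
>>>> number is a hypothetical placeholder, not a measurement:
>>>>
>>>> # Back-of-envelope FLOPS-per-watt-per-dollar comparison of two node types.
>>>> # The gflops/watts/dollars figures are placeholders, not benchmark results.
>>>>
>>>> def score(gflops, watts, dollars):
>>>>     """Sustained GFLOPS per watt per dollar -- higher is better."""
>>>>     return gflops / (watts * dollars)
>>>>
>>>> candidates = [
>>>>     {"name": "mini-ITX Atom board", "gflops": 30.0, "watts": 25.0, "dollars": 350.0},
>>>>     {"name": "Intel Edison",        "gflops": 2.0,  "watts": 1.0,  "dollars": 75.0},
>>>> ]
>>>>
>>>> for node in candidates:
>>>>     s = score(node["gflops"], node["watts"], node["dollars"])
>>>>     print(f"{node['name']:<20} {s:.5f} GFLOPS/W/$")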
>>>>
>>>> On Tue, Feb 3, 2015 at 3:44 PM, Hunter Fuller <hfuller at pixilic.com>
>>>> wrote:
>>>>
>>>>> 27 devices in a metal box will work fine, provided there is also a
>>>>> fairly robust AP in that box. I would personally still lean toward USB
>>>>> Ethernet, though. But that increases your device's size and complexity... Hm.
>>>>>
>>>>> As far as PXE boot goes, since there is no wired Ethernet available, I
>>>>> doubt that is a thing. However, you can mount the internal storage as
>>>>> /boot and have a script run that rsyncs the /boot fs between the boxes
>>>>> and a server. The rest can be achieved by using an NFS volume as your
>>>>> root partition. This setup is commonly done on armies of Raspberry Pis.
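>>>>>
>>>>> A minimal sketch of that sync step, assuming a hypothetical provisioning
>>>>> server that exports an rsync module named 'boot' (the real thing could
>>>>> just as easily be a cron'd shell one-liner):
>>>>>
>>>>> #!/usr/bin/env python3
>>>>> # Pull the canonical /boot contents from the provisioning server.
>>>>> # Hostname, module name, and paths are hypothetical.
>>>>> import subprocess
>>>>>
>>>>> REMOTE_BOOT = "rsync://provision.local/boot/"
>>>>>
>>>>> def sync_boot():
>>>>>     # -a preserves permissions and timestamps, --delete drops stale files
>>>>>     subprocess.run(["rsync", "-a", "--delete", REMOTE_BOOT, "/boot/"], check=True)
>>>>>
>>>>> if __name__ == "__main__":
>>>>>     sync_boot()
>>>>>     # The root filesystem itself would come over NFS instead, e.g. a kernel
>>>>>     # command line like: root=/dev/nfs nfsroot=provision.local:/exports/rootfs ip=dhcp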
>>>>>
>>>>> There wouldn't be much difference between original prep on this and
>>>>> originally preparing several SD cards. In one case, you have to connect
>>>>> each device to a provisioning station. In the other case, you connect each
>>>>> SD card to the same station. Not much different, and once you boot one
>>>>> time, you can do the maintenance in an automated fashion across all nodes.
>>>>>  On Jan 23, 2015 9:36 AM, "Michael Carroll" <carroll.michael at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Stephan,
>>>>>>
>>>>>>  I didn't realize that the Edison was WiFi-only. I'm interested to
>>>>>> hear how 27 WiFi devices in a metal box will work.
>>>>>>
>>>>>>  Also, do you know if the Edison can PXE boot? I think that's the best
>>>>>> approach for booting a whole bunch of homogeneous computers; it would
>>>>>> certainly be more maintenance overhead without that capability.
>>>>>>
>>>>>>  ~mc
>>>>>>
>>>>>>
>>>>>> On Thu, Jan 22, 2015 at 11:04 PM, Stephan Henning <shenning at gmail.com
>>>>>> > wrote:
>>>>>>
>>>>>>> @Erik
>>>>>>> Well, the RasPi and BeagleBone have less RAM than the Edison. I'll
>>>>>>> have to take a look at the Rock; the Pro version offers 2 GB, but since
>>>>>>> the Edison is an x86 platform it has advantages in many ways.
>>>>>>>
>>>>>>>  @Tim
>>>>>>>  Ya, that looks very similar. I'll give it a read through in the
>>>>>>> morning. I'll make sure to keep you updated.
>>>>>>>
>>>>>>> On Thu, Jan 22, 2015 at 10:11 PM, Erik Arendall <earendall at gmail.com
>>>>>>> > wrote:
>>>>>>>
>>>>>>>> Not sure of your RAM requirements, but there are options in the
>>>>>>>> RasPi and BeagleBone Black, and check out the Radxa Rock.
>>>>>>>>
>>>>>>>> http://radxa.com/Rock
>>>>>>>>
>>>>>>>> Erik
>>>>>>>>   On Jan 22, 2015 10:07 PM, "Tim H" <crashcartpro at gmail.com> wrote:
>>>>>>>>
>>>>>>>>>   This sounds like a fun project!
>>>>>>>>> Reminds me of this guy:
>>>>>>>>> http://www.pcworld.idg.com.au/article/349862/seamicro_cloud_server_sports_512_atom_processors/
>>>>>>>>>  (cluster of low power processors in a single box)
>>>>>>>>>
>>>>>>>>>  I'd also been kicking a similar idea around for the last year,
>>>>>>>>> but I've had no real ability to do it, so I'd love to see your progress!
>>>>>>>>>  -Tim
>>>>>>>>>
>>>>>>>>> On Thu, Jan 22, 2015 at 9:10 PM, Stephan Henning <
>>>>>>>>> shenning at gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> In some ways, yes. The biggest limitation with the Edison for me
>>>>>>>>>> is the RAM. While there is a lot that we could run on it, it
>>>>>>>>>> restricts things enough that I don't think it would be as useful,
>>>>>>>>>> which alters the true 'cost' of the setup.
>>>>>>>>>>
>>>>>>>>>>  Granted, you could probably fit a few hundred of them in a 4U
>>>>>>>>>> chassis. It would be an interesting experiment in integration,
>>>>>>>>>> though, since they have no Ethernet interface, only wireless.
>>>>>>>>>>
>>>>>>>>>> On Thu, Jan 22, 2015 at 9:02 PM, Erik Arendall <
>>>>>>>>>> earendall at gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> I've often kicked around the idea of doing this with Arduinos and
>>>>>>>>>>> FPGAs. I guess you could also do it with Intel Edison modules.
>>>>>>>>>>> Cost-wise, the Edison modules would be better than a PC.
>>>>>>>>>>>
>>>>>>>>>>> Erik
>>>>>>>>>>>   On Jan 22, 2015 6:44 PM, "Stephan Henning" <shenning at gmail.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>>  @mc
>>>>>>>>>>>> Both. If I start to scale this to a large number of nodes, I can
>>>>>>>>>>>> foresee many headaches if I can't easily push modifications and
>>>>>>>>>>>> updates. From the job distribution side, it would be great to
>>>>>>>>>>>> maintain compatibility with Condor; I'm just unsure how well it
>>>>>>>>>>>> will operate if it has to hand jobs off to the head node, which
>>>>>>>>>>>> then distributes them out further.
>>>>>>>>>>>>
>>>>>>>>>>>>  @ Brian
>>>>>>>>>>>> Our current cluster is made up of discrete machines, only about
>>>>>>>>>>>> 20 nodes. Many of the nodes are actual user workstations that are
>>>>>>>>>>>> brought in when inactive. There is no uniform provisioning method.
>>>>>>>>>>>> Every box has a slightly different hardware configuration.
>>>>>>>>>>>> Thankfully, we do a pretty good job keeping all required software
>>>>>>>>>>>> aligned to the same version.
>>>>>>>>>>>>
>>>>>>>>>>>>  The VM idea is interesting. I hadn't considered that. I will
>>>>>>>>>>>> need to think on that and how I might be able to implement it.
>>>>>>>>>>>>
>>>>>>>>>>>>  @david
>>>>>>>>>>>> Yup, I'm fully aware this level of distributed computing is
>>>>>>>>>>>> only good for specific cases. I understand your position, thanks.
>>>>>>>>>>>>
>>>>>>>>>>>> -stephan
>>>>>>>>>>>>
>>>>>>>>>>>>  ---———---•---———---•---———---
>>>>>>>>>>>> Sent from a mobile device, please excuse the spelling and
>>>>>>>>>>>> brevity.
>>>>>>>>>>>>
>>>>>>>>>>>> On Jan 22, 2015, at 5:54 PM, Brian Oborn <linuxpunk at gmail.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>   I would be tempted to just copy what the in-house cluster
>>>>>>>>>>>> uses for provisioning. That will save you a lot of time and make it easier
>>>>>>>>>>>> to integrate with the larger cluster if you choose to do so. Although it
>>>>>>>>>>>> can be tempting to get hardware in your hands, I've done a lot of work with
>>>>>>>>>>>> building all of the fiddly Linux bits (DHCP+TFTP+root on NFS+NFS home) in
>>>>>>>>>>>> several VMs before moving to real hardware. You can set up a private
>>>>>>>>>>>> VM-only network between your head node and the slave nodes and work from
>>>>>>>>>>>> there.
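>>>>>>>>>>>>
>>>>>>>>>>>> For what it's worth, the head-node side of that testbed can be
>>>>>>>>>>>> sketched as a couple of generated config files. This assumes
>>>>>>>>>>>> dnsmasq for the DHCP+TFTP half and a hypothetical 10.0.0.0/24
>>>>>>>>>>>> VM-only network, so treat it as a starting point, not a recipe:
>>>>>>>>>>>>
>>>>>>>>>>>> #!/usr/bin/env python3
>>>>>>>>>>>> # Sketch: write dnsmasq (DHCP+TFTP) and NFS export configs for the head node.
>>>>>>>>>>>> # Subnet, interface name, and paths are assumptions for illustration only.
>>>>>>>>>>>>
>>>>>>>>>>>> DNSMASQ_CONF = """\
>>>>>>>>>>>> # private VM-only network between head node and slaves
>>>>>>>>>>>> interface=eth1
>>>>>>>>>>>> dhcp-range=10.0.0.100,10.0.0.200,12h
>>>>>>>>>>>> # serve pxelinux.0, kernel, and initrd over TFTP
>>>>>>>>>>>> enable-tftp
>>>>>>>>>>>> tftp-root=/srv/tftp
>>>>>>>>>>>> dhcp-boot=pxelinux.0
>>>>>>>>>>>> """
>>>>>>>>>>>>
>>>>>>>>>>>> EXPORTS = """\
>>>>>>>>>>>> /srv/rootfs  10.0.0.0/24(rw,no_root_squash,no_subtree_check)
>>>>>>>>>>>> /srv/home    10.0.0.0/24(rw,no_root_squash,no_subtree_check)
>>>>>>>>>>>> """
>>>>>>>>>>>>
>>>>>>>>>>>> if __name__ == "__main__":
>>>>>>>>>>>>     with open("/etc/dnsmasq.d/cluster.conf", "w") as f:
>>>>>>>>>>>>         f.write(DNSMASQ_CONF)
>>>>>>>>>>>>     with open("/etc/exports.d/cluster.exports", "w") as f:
>>>>>>>>>>>>         f.write(EXPORTS)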
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Jan 22, 2015 at 5:31 PM, Michael Carroll <
>>>>>>>>>>>> carroll.michael at gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>>  So is your concern with provisioning and setup or with
>>>>>>>>>>>>> actual job distribution?
>>>>>>>>>>>>>
>>>>>>>>>>>>> ~mc mobile
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Jan 22, 2015, at 17:15, Stephan Henning <shenning at gmail.com>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>   This is a side project for the office. Sadly, most of this
>>>>>>>>>>>>> type of work can't be farmed out to external clusters, otherwise we would
>>>>>>>>>>>>> use it for that. We do currently utilize AWS for some of this type of work,
>>>>>>>>>>>>> but only for internal R&D.
>>>>>>>>>>>>>
>>>>>>>>>>>>>  This all started when the Intel Edison got released. Some of
>>>>>>>>>>>>> us were talking about it one day and realized that it *might*
>>>>>>>>>>>>> have *just enough* processing power and RAM to handle some of
>>>>>>>>>>>>> our smaller problems. We've talked about it some more and the discussion
>>>>>>>>>>>>> has evolved to the point where I've been handed some hours and a small
>>>>>>>>>>>>> amount of funding to try and implement a 'cluster-in-a-box'.
>>>>>>>>>>>>>
>>>>>>>>>>>>>  The main idea is to rack a whole bunch of mini-ITX boards on
>>>>>>>>>>>>> edge in a 4U chassis (yes, they will fit). Assuming a 2" board-to-board
>>>>>>>>>>>>> clearance across the width of the chassis and 1" spacing back-to-front
>>>>>>>>>>>>> down the depth of the box, I think I could fit 27 boards into a 36" deep
>>>>>>>>>>>>> chassis, with enough room for the power supplies and interconnects.
>>>>>>>>>>>>>
>>>>>>>>>>>>>  Utilizing embedded motherboards with Atom C2750 8-core CPUs
>>>>>>>>>>>>> and 16 GB of RAM per board, that should give me a pretty substantial
>>>>>>>>>>>>> cluster to play with.  Obviously I am starting small, probably with two
>>>>>>>>>>>>> or three boards running Q2900 4-core CPUs, until I can get the software
>>>>>>>>>>>>> side worked out.
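>>>>>>>>>>>>>
>>>>>>>>>>>>>  The back-of-envelope math behind those numbers looks roughly like
>>>>>>>>>>>>> this; the chassis dimensions are my rough assumptions, not measured
>>>>>>>>>>>>> specs:
>>>>>>>>>>>>>
>>>>>>>>>>>>> # Rough packing and aggregate-capacity estimate for the 4U cluster-in-a-box.
>>>>>>>>>>>>> import math
>>>>>>>>>>>>>
>>>>>>>>>>>>> CHASSIS_WIDTH_IN = 17.0   # usable interior width of a 19" chassis (assumed)
>>>>>>>>>>>>> CHASSIS_DEPTH_IN = 36.0
>>>>>>>>>>>>> BOARD_SIZE_IN    = 6.7    # mini-ITX is 170 mm square
>>>>>>>>>>>>> WIDTH_PITCH_IN   = 2.0    # board-to-board pitch across the width
>>>>>>>>>>>>> ROW_GAP_IN       = 1.0    # spacing between rows, front to back
>>>>>>>>>>>>>
>>>>>>>>>>>>> per_row = math.floor(CHASSIS_WIDTH_IN / WIDTH_PITCH_IN)
>>>>>>>>>>>>> rows    = math.floor(CHASSIS_DEPTH_IN / (BOARD_SIZE_IN + ROW_GAP_IN))
>>>>>>>>>>>>> print(f"{per_row} per row x {rows} rows = {per_row * rows} slots before PSU space")
>>>>>>>>>>>>>
>>>>>>>>>>>>> CORES_PER_BOARD, RAM_GB_PER_BOARD = 8, 16   # Atom C2750 boards
>>>>>>>>>>>>> print(f"27 boards -> {27 * CORES_PER_BOARD} cores, {27 * RAM_GB_PER_BOARD} GB RAM")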
>>>>>>>>>>>>>
>>>>>>>>>>>>>  The software-infrastructure side is the part I'm having a
>>>>>>>>>>>>> hard time with. While there are options out there for how to do this,
>>>>>>>>>>>>> they are all relatively involved, and there isn't an obvious 'best'
>>>>>>>>>>>>> choice to me right now. Currently our in-house HPC cluster utilizes
>>>>>>>>>>>>> HTCondor for its backbone, so I would like to maintain some sort of
>>>>>>>>>>>>> connection to it. Otherwise, I'm seeing options in the Beowulf and
>>>>>>>>>>>>> Rocks areas that could be useful; I'm just not sure where to start,
>>>>>>>>>>>>> in all honesty.
>>>>>>>>>>>>>
>>>>>>>>>>>>>  At the end of the day, this needs to be relatively easy for
>>>>>>>>>>>>> us to manage (time spent working on the cluster is time spent not
>>>>>>>>>>>>> billing the customer) while being easy enough to add nodes to,
>>>>>>>>>>>>> assuming this is a success and I get the OK to expand it to a full
>>>>>>>>>>>>> 42U rack's worth.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>  Our current cluster is almost always fully utilized.
>>>>>>>>>>>>> Currently we've got about a 2 month backlog of jobs on it.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Thu, Jan 22, 2015 at 4:55 PM, Brian Oborn <
>>>>>>>>>>>>> linuxpunk at gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> If you can keep your utilization high, then your own hardware
>>>>>>>>>>>>>> can be much more cost-effective. However, if you end up paying
>>>>>>>>>>>>>> depreciation and maintenance on a cluster that's doing nothing most
>>>>>>>>>>>>>> of the time, you'd be better off in the cloud.
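>>>>>>>>>>>>>>
>>>>>>>>>>>>>> A toy break-even calculation, with every number made up purely to
>>>>>>>>>>>>>> illustrate the utilization point:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> # Owned-hardware cost per *useful* hour vs. a cloud hourly rate (made-up prices).
>>>>>>>>>>>>>> CLUSTER_COST   = 15000.0   # purchase price
>>>>>>>>>>>>>> LIFETIME_YEARS = 3.0
>>>>>>>>>>>>>> MAINT_PER_YEAR = 1000.0    # power, parts, admin time
>>>>>>>>>>>>>> CLOUD_RATE     = 1.50      # per cluster-equivalent hour
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> HOURS_PER_YEAR = 24 * 365
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> def owned_cost_per_useful_hour(utilization):
>>>>>>>>>>>>>>     yearly = CLUSTER_COST / LIFETIME_YEARS + MAINT_PER_YEAR
>>>>>>>>>>>>>>     return yearly / (HOURS_PER_YEAR * utilization)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> for u in (0.1, 0.5, 0.9):
>>>>>>>>>>>>>>     print(f"{u:.0%} utilized: ${owned_cost_per_useful_hour(u):.2f}/hr"
>>>>>>>>>>>>>>           f" vs cloud ${CLOUD_RATE:.2f}/hr")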
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Thu, Jan 22, 2015 at 4:50 PM, Michael Carroll <
>>>>>>>>>>>>>> carroll.michael at gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Depending on what you are going to do, it seems like it
>>>>>>>>>>>>>>> would make more sense to use AWS or Digital Ocean these days, rather than
>>>>>>>>>>>>>>> standing up your own hardware. Maintaining your own hardware sucks.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>  That being said, if you are doing something that requires
>>>>>>>>>>>>>>> InfiniBand, then hardware is your only choice :)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>  ~mc
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Thu, Jan 22, 2015 at 4:43 PM, Joshua Pritt <
>>>>>>>>>>>>>>> ramgarden at gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> My friends and I installed a Beowulf cluster on a closet full
>>>>>>>>>>>>>>>> of donated Pentium 75 MHz machines just for fun many years ago,
>>>>>>>>>>>>>>>> back when Beowulf was just getting popular.  We never figured out
>>>>>>>>>>>>>>>> anything to do with it though...
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Thu, Jan 22, 2015 at 5:31 PM, Brian Oborn <
>>>>>>>>>>>>>>>> linuxpunk at gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> In my previous job I set up several production Beowulf
>>>>>>>>>>>>>>>>> clusters, mainly for particle physics simulations, and this has
>>>>>>>>>>>>>>>>> been an area of intense interest for me. I would be excited to
>>>>>>>>>>>>>>>>> help you out, and I think I could provide some good assistance.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>  Brian Oborn (aka bobbytables)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Thu, Jan 22, 2015 at 4:25 PM, Stephan Henning <
>>>>>>>>>>>>>>>>> shenning at gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Does anyone on the mailing list have any experience with
>>>>>>>>>>>>>>>>>> setting up a cluster computation system? If so and you are willing to humor
>>>>>>>>>>>>>>>>>> my questions, I'd greatly appreciate a few minutes of your time.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>  -stephan
>>
>> --
>> This headspace for rent