[ML-General] Cluster Computing

Brian Oborn linuxpunk at gmail.com
Thu Jan 22 17:54:44 CST 2015


I would be tempted to just copy what the in-house cluster uses for
provisioning. That will save you a lot of time and make it easier to
integrate with the larger cluster if you choose to do so. Although it can
be tempting to get hardware in your hands right away, I've done a lot of
work building all of the fiddly Linux bits (DHCP + TFTP + root on NFS +
NFS home) in several VMs before moving to real hardware. You can set up a
private VM-only network between your head node and the slave nodes and
work from there.
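For concreteness, here is a minimal sketch of those fiddly bits on the head node, assuming dnsmasq serves both DHCP and TFTP on the private network; the interface name, address range, and paths are illustrative, not from any particular setup.

```
# /etc/dnsmasq.conf -- DHCP + TFTP for PXE-booting the slave nodes
interface=eth1                       # head node's private, VM-only interface
dhcp-range=10.0.0.50,10.0.0.150,12h  # lease pool for the slave nodes
dhcp-boot=pxelinux.0                 # bootloader fetched over TFTP
enable-tftp
tftp-root=/srv/tftp

# /etc/exports -- read-only NFS root image plus shared home for the slaves
/srv/nfsroot  10.0.0.0/24(ro,no_root_squash,no_subtree_check)
/home         10.0.0.0/24(rw,no_root_squash,no_subtree_check)
```

The nice part of doing this in VMs first is that tearing down and rebuilding the whole network costs nothing but time.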

On Thu, Jan 22, 2015 at 5:31 PM, Michael Carroll <carroll.michael at gmail.com>
wrote:

> So is your concern with provisioning and setup or with actual job
> distribution?
>
> ~mc mobile
>
> On Jan 22, 2015, at 17:15, Stephan Henning <shenning at gmail.com> wrote:
>
> This is a side project for the office. Sadly, most of this type of work
> can't be farmed out to external clusters; otherwise we would use them for
> that. We do currently utilize AWS for some of this type of work, but only
> for internal R&D.
>
> This all started when the Intel Edison was released. Some of us were
> talking about it one day and realized that it *might* have *just enough*
> processing power and RAM to handle some of our smaller problems. We've
> talked about it some more, and the discussion has evolved to the point
> where I've been handed some hours and a small amount of funding to try to
> implement a 'cluster-in-a-box'.
>
> The main idea is to rack a whole bunch of mini-ITX boards on edge in a
> 4U chassis (yes, they will fit). Assuming a 2" board-to-board clearance
> across the width of the chassis and 1" spacing back-to-front down the
> depth of the box, I think I could fit 27 boards into a 36" deep chassis,
> with enough room for the power supplies and interconnects.
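A rough sanity check of that packing estimate is easy to script. The 36" depth and the clearances come from the message above; the usable interior width, the mini-ITX footprint, and the on-edge pitch are assumed numbers for illustration, so the result is a ballpark rather than a confirmation of the 27-board figure.

```python
# Back-of-the-envelope check of how many mini-ITX boards fit on edge in a 4U box.
# Assumed dimensions (not from the original message): 17" usable width,
# 6.7" x 6.7" mini-ITX footprint, 2.5" on-edge pitch (board + ~2" clearance).
CHASSIS_DEPTH_IN = 36.0
USABLE_WIDTH_IN = 17.0
BOARD_DEPTH_IN = 6.7
FRONT_BACK_SPACING_IN = 1.0
ON_EDGE_PITCH_IN = 2.5

rows_deep = int(CHASSIS_DEPTH_IN // (BOARD_DEPTH_IN + FRONT_BACK_SPACING_IN))
boards_per_row = int(USABLE_WIDTH_IN // ON_EDGE_PITCH_IN)
total = rows_deep * boards_per_row
depth_used = rows_deep * (BOARD_DEPTH_IN + FRONT_BACK_SPACING_IN)

print(f"{boards_per_row} boards/row x {rows_deep} rows = {total} boards")
print(f'depth used: {depth_used:.1f}" of {CHASSIS_DEPTH_IN:.0f}"')
```

Under those assumptions the script lands in the mid-20s with several inches of depth left over, which is at least the same ballpark as the 27 quoted, with the leftover depth available for supplies and cabling.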
>
> Utilizing embedded motherboards with 8-core Atom C2750 CPUs and 16 GB of
> RAM per board, that should give me a pretty substantial cluster to play
> with. Obviously I am starting small, probably with two or three boards
> running 4-core Q2900 CPUs until I can get the software side worked out.
>
> The software-infrastructure side is the part I'm having a hard time with.
> While there are options out there for how to do this, they are all
> relatively involved and there isn't an obvious 'best' choice to me right
> now. Our in-house HPC cluster currently uses HTCondor for its backbone,
> so I would like to maintain some sort of connection to it. Otherwise, I'm
> seeing options in the Beowulf and Rocks areas that could be useful; I'm
> just not sure where to start, in all honesty.
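Since the in-house cluster already speaks HTCondor, one low-effort starting point is to treat the new boards as extra execute nodes and smoke-test them with a trivial submit description file; the file name, resource requests, and job count below are illustrative.

```
# sanity.sub -- trivial HTCondor job to exercise the new nodes
executable     = /bin/hostname
output         = out.$(Process)
error          = err.$(Process)
log            = sanity.log
request_cpus   = 1
request_memory = 512MB
queue 8
```

Submitted with `condor_submit sanity.sub`; if the new slots are joined to the pool, `condor_status` should show them picking up the jobs.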
>
> At the end of the day, this needs to be relatively easy for us to manage
> (time spent working on the cluster is time spent not billing the customer)
> while being easy enough to add nodes to, assuming this is a success and I
> get the OK to expand it to a full 42U rack's worth.
>
>
> Our current cluster is almost always fully utilized; we've got about a
> two-month backlog of jobs on it.
>
>
> On Thu, Jan 22, 2015 at 4:55 PM, Brian Oborn <linuxpunk at gmail.com> wrote:
>
>> If you can keep your utilization high, then your own hardware can be much
>> more cost effective. However, if you end up paying depreciation and
>> maintenance on a cluster that's doing nothing most of the time you'd be
>> better off in the cloud.
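That utilization argument can be made concrete with a break-even calculation; every dollar figure below is a made-up placeholder, not a quote for any real hardware or cloud offering.

```python
# Break-even utilization for owned hardware vs. renting equivalent cloud capacity.
# All dollar figures are illustrative placeholders -- plug in real quotes.
HARDWARE_COST = 15_000.0        # cluster purchase price
LIFETIME_YEARS = 3.0            # straight-line depreciation window
UPKEEP_PER_YEAR = 2_000.0       # power, cooling, admin time
CLOUD_RATE_PER_HOUR = 4.0       # equivalent cloud capacity, per busy hour

HOURS_PER_YEAR = 24 * 365
owned_cost_per_year = HARDWARE_COST / LIFETIME_YEARS + UPKEEP_PER_YEAR

# Utilization at which a year of owned compute costs the same as renting
# that many busy hours from the cloud.
breakeven_utilization = owned_cost_per_year / (CLOUD_RATE_PER_HOUR * HOURS_PER_YEAR)
print(f"break-even utilization: {breakeven_utilization:.1%}")
```

With these placeholder numbers the break-even point is around 20% utilization; a cluster sitting at full load with a two-month backlog is well past that, while one idling most of the year is not.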
>>
>> On Thu, Jan 22, 2015 at 4:50 PM, Michael Carroll <
>> carroll.michael at gmail.com> wrote:
>>
>>> Depending on what you are going to do, it seems like it would make more
>>> sense to use AWS or Digital Ocean these days, rather than standing up your
>>> own hardware. Maintaining your own hardware sucks.
>>>
>>> That being said, if you are doing something that requires InfiniBand,
>>> then hardware is your only choice :)
>>>
>>> ~mc
>>>
>>> On Thu, Jan 22, 2015 at 4:43 PM, Joshua Pritt <ramgarden at gmail.com>
>>> wrote:
>>>
>>>> My friends and I installed a Beowulf cluster on a closet full of
>>>> 75 MHz Pentium machines that were donated to us, just for fun, many
>>>> years ago when Beowulf was first getting popular.  We never figured
>>>> out anything to do with it, though...
>>>>
>>>> On Thu, Jan 22, 2015 at 5:31 PM, Brian Oborn <linuxpunk at gmail.com>
>>>> wrote:
>>>>
>>>>> In my previous job I set up several production Beowulf clusters,
>>>>> mainly for particle physics simulations and this has been an area of
>>>>> intense interest for me. I would be excited to help you out and I think I
>>>>> could provide some good assistance.
>>>>>
>>>>> Brian Oborn (aka bobbytables)
>>>>>
>>>>>
>>>>> On Thu, Jan 22, 2015 at 4:25 PM, Stephan Henning <shenning at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Does anyone on the mailing list have any experience with setting up a
>>>>>> cluster computation system? If so and you are willing to humor my
>>>>>> questions, I'd greatly appreciate a few minutes of your time.
>>>>>>
>>>>>> -stephan
>>>>>>
>>>>>> _______________________________________________
>>>>>> General mailing list
>>>>>> General at lists.makerslocal.org
>>>>>> http://lists.makerslocal.org/mailman/listinfo/general
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>>
>>
>
>
>
>