[ML-General] Cluster Computing

Erik Arendall earendall at gmail.com
Thu Jan 22 21:02:13 CST 2015


I've often kicked around the idea of doing this with Arduinos and FPGAs. I
guess you could also do it with Intel Edison modules. Cost-wise, the Edison
modules would be better than a PC.

Erik
On Jan 22, 2015 6:44 PM, "Stephan Henning" <shenning at gmail.com> wrote:

> @mc
> Both. If I start to scale this to a large number of nodes, I can foresee
> many headaches if I can't easily push modifications and updates. On the
> job distribution side, it would be great to maintain compatibility with
> Condor; I'm just unsure how well it will operate if it has to hand jobs off
> to the head node, which then distributes them out further.
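>
> From skimming the HTCondor manual, flocking looks like it could cover
> that hand-off; a rough sketch, with all hostnames as placeholders:
>
>     # condor_config on the existing pool's submit machines: overflow
>     # jobs to the box when local resources are busy.
>     FLOCK_TO = box-head.ourdomain.local
>
>     # condor_config on the box's central manager: accept flocked jobs
>     # from the main pool's submit machine.
>     FLOCK_FROM = submit.ourdomain.local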
>
> @ Brian
> Our current cluster is made up of discrete machines, only about 20 nodes.
> Many of the nodes are actual user workstations that are brought in when
> inactive. There is no uniform provisioning method; every box has a slightly
> different hardware configuration. Thankfully, we do a pretty good job of
> keeping all required software aligned to the same version.
>
> The VM idea is interesting. I hadn't considered that. I will need to think
> on that and how I might be able to implement it.
>
> @david
> Yup, I'm fully aware this level of distributed computing is only good for
> specific cases. I understand your position, thanks.
>
> -stephan
>
> Sent from a mobile device, please excuse the spelling and brevity.
>
> On Jan 22, 2015, at 5:54 PM, Brian Oborn <linuxpunk at gmail.com> wrote:
>
> I would be tempted to just copy what the in-house cluster uses for
> provisioning. That will save you a lot of time and make it easier to
> integrate with the larger cluster if you choose to do so. Although it can
> be tempting to get hardware in your hands right away, I've done a lot of
> work building all of the fiddly Linux bits (DHCP + TFTP + root on NFS +
> NFS home) in several VMs before moving to real hardware. You can set up a
> private VM-only network between your head node and the slave nodes and
> work from there.
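>
> If you go that route, dnsmasq can cover both DHCP and TFTP in one
> daemon; a minimal sketch, with all addresses and paths as placeholders:
>
>     # /etc/dnsmasq.conf on the head node (VM or physical)
>     interface=eth1                        # private cluster network
>     dhcp-range=10.0.0.50,10.0.0.150,12h   # lease pool for the nodes
>     dhcp-boot=pxelinux.0                  # hand out the PXE bootloader
>     enable-tftp
>     tftp-root=/srv/tftp
>
>     # /srv/tftp/pxelinux.cfg/default: boot the nodes with an NFS root
>     DEFAULT linux
>     LABEL linux
>         KERNEL vmlinuz
>         APPEND root=/dev/nfs nfsroot=10.0.0.1:/srv/nfsroot ip=dhcp rw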
>
> On Thu, Jan 22, 2015 at 5:31 PM, Michael Carroll <
> carroll.michael at gmail.com> wrote:
>
>> So is your concern with provisioning and setup or with actual job
>> distribution?
>>
>> ~mc mobile
>>
>> On Jan 22, 2015, at 17:15, Stephan Henning <shenning at gmail.com> wrote:
>>
>> This is a side project for the office. Sadly, most of this type of work
>> can't be farmed out to external clusters; otherwise we would use them for
>> that. We do currently utilize AWS for some of this type of work, but only
>> for internal R&D.
>>
>> This all started when the Intel Edison got released. Some of us were
>> talking about it one day and realized that it *might* have *just enough*
>> processing power and RAM to handle some of our smaller problems. We've
>> talked about it some more, and the discussion has evolved to the point
>> where I've been handed some hours and a small amount of funding to try to
>> implement a 'cluster-in-a-box'.
>>
>> The main idea is to rack a whole bunch of mini-ITX boards on edge in a
>> 4U chassis (yes, they will fit). Assuming a 2" board-to-board clearance
>> across the width of the chassis and 1" spacing back-to-front down the
>> depth of the box, I think I could fit 27 boards into a 36"-deep chassis,
>> with enough room for the power supplies and interconnects.
>>
>> Utilizing embedded motherboards with Atom C2750 8-core CPUs and 16 GB of
>> RAM per board, that should give me a pretty substantial cluster to play
>> with. Obviously I am starting small, probably with two or three boards
>> running Q2900 4-core CPUs, until I can get the software side worked out.
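>>
>> (For scale, the full 27-board build would come to 27 x 8 = 216 cores and
>> 27 x 16 GB = 432 GB of RAM in a single 4U chassis.)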
>>
>> The software-infrastructure side is the part I'm having a hard time
>> with. While there are options out there for how to do this, they are all
>> relatively involved, and there isn't an obvious 'best' choice to me right
>> now. Currently our in-house HPC cluster utilizes HTCondor for its
>> backbone, so I would like to maintain some sort of connection to it.
>> Otherwise, I'm seeing options in the Beowulf and Rocks areas that could
>> be useful; I'm just not sure where to start, in all honesty.
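>>
>> From what I've read so far, hooking the boards into the existing pool as
>> execute-only nodes might just be a matter of pointing them at the
>> central manager; a sketch, with hostnames as placeholders:
>>
>>     # condor_config.local on each board
>>     CONDOR_HOST = cm.ourdomain.local    # the pool's central manager
>>     DAEMON_LIST = MASTER, STARTD        # execute-only: no schedd here
>>     ALLOW_WRITE = *.ourdomain.local     # let pool machines start jobs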
>>
>> At the end of the day, this needs to be relatively easy for us to manage
>> (time spent working on the cluster is time spent not billing the
>> customer) while being easy enough to add nodes to, assuming this is a
>> success and I get the OK to expand it to a full 42U rack's worth.
>>
>>
>> Our current cluster is almost always fully utilized. Currently we've got
>> about a two-month backlog of jobs on it.
>>
>>
>> On Thu, Jan 22, 2015 at 4:55 PM, Brian Oborn <linuxpunk at gmail.com> wrote:
>>
>>> If you can keep your utilization high, then your own hardware can be
>>> much more cost-effective. However, if you end up paying depreciation and
>>> maintenance on a cluster that's doing nothing most of the time, you'd be
>>> better off in the cloud.
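>>>
>>> To put rough numbers on it (made-up figures, purely for scale): a
>>> $1,000 node versus a comparable $0.10/hour cloud instance breaks even
>>> at 10,000 node-hours, about 14 months of continuous use, but at 20%
>>> utilization that stretches to roughly 5-6 years, longer than most
>>> hardware depreciation schedules.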
>>>
>>> On Thu, Jan 22, 2015 at 4:50 PM, Michael Carroll <
>>> carroll.michael at gmail.com> wrote:
>>>
>>>> Depending on what you are going to do, it seems like it would make more
>>>> sense to use AWS or Digital Ocean these days, rather than standing up your
>>>> own hardware. Maintaining your own hardware sucks.
>>>>
>>>> That being said, if you are doing something that requires InfiniBand,
>>>> then hardware is your only choice :)
>>>>
>>>> ~mc
>>>>
>>>> On Thu, Jan 22, 2015 at 4:43 PM, Joshua Pritt <ramgarden at gmail.com>
>>>> wrote:
>>>>
>>>>> My friends and I installed a Beowulf cluster in a closet full of
>>>>> Pentium 75 MHz machines that were donated to us, just for fun, many
>>>>> years ago back when Beowulf was first getting popular. We never
>>>>> figured out anything to do with it, though...
>>>>>
>>>>> On Thu, Jan 22, 2015 at 5:31 PM, Brian Oborn <linuxpunk at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> In my previous job I set up several production Beowulf clusters,
>>>>>> mainly for particle physics simulations, and this has been an area
>>>>>> of intense interest for me. I would be excited to help you out, and
>>>>>> I think I could provide some good assistance.
>>>>>>
>>>>>> Brian Oborn (aka bobbytables)
>>>>>>
>>>>>>
>>>>>> On Thu, Jan 22, 2015 at 4:25 PM, Stephan Henning <shenning at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Does anyone on the mailing list have any experience with setting up
>>>>>>> a cluster computation system? If so, and you are willing to humor my
>>>>>>> questions, I'd greatly appreciate a few minutes of your time.
>>>>>>>
>>>>>>> -stephan
>>>>>>>