[ML-General] Cluster Computing

Hunter Fuller hfuller at pixilic.com
Tue Feb 3 15:44:05 CST 2015


27 devices in a metal box will work fine, provided there is also a fairly
robust AP in that box. I would personally still lean toward USB Ethernet
though. But that increases your device's size and complexity... Hm.

As for PXE boot, since there is no wired Ethernet available, I doubt
that is an option. However, you can mount the internal storage as /boot, and
have a script run that rsyncs the /boot fs between the boxes and a server.
The rest can be achieved by using an NFS volume as your root partition.
This setup is commonly done on armies of Raspberry Pis.
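
A rough sketch of the /boot sync and NFS root idea, assuming rsync over SSH is available on each node and that each node pulls its own copy from the server. The server name, paths, node names, and both function names are invented for illustration. Whether the Edison's kernel and wifi bring-up can actually support an NFS root is not something this sketch addresses.

```python
import shlex


def boot_sync_command(node, server="provision.example", remote_root="/srv/boot"):
    """Return the rsync argv a node could run to refresh its local /boot.

    server, remote_root and the per-node directory layout are hypothetical.
    """
    source = f"{server}:{remote_root}/{node}/"
    # -a preserves permissions, times and symlinks; --delete removes local
    # files that no longer exist in the server's copy.
    return ["rsync", "-a", "--delete", source, "/boot/"]


def nfs_root_cmdline(server_ip, export_path):
    """Return kernel command-line arguments for an NFS root filesystem."""
    # root=/dev/nfs plus nfsroot=<ip>:<path> is the standard Linux nfsroot
    # syntax; ip=dhcp asks the kernel to configure networking via DHCP.
    return f"root=/dev/nfs nfsroot={server_ip}:{export_path} ip=dhcp rw"


if __name__ == "__main__":
    # Print the commands for three hypothetical nodes without running them.
    for n in range(1, 4):
        node = f"node{n:02d}"
        print(shlex.join(boot_sync_command(node)))
        print(nfs_root_cmdline("192.168.1.1", f"/srv/nfsroot/{node}"))
```

The functions only build the command and the kernel arguments, so they can be checked on a workstation before anything touches a real node.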

There wouldn't be much difference between the initial prep for this and
initially preparing several SD cards. In one case, you have to connect
each device to a provisioning station. In the other case, you connect each
SD card to the same station. Not much different, and once you boot one
time, you can do the maintenance in an automated fashion across all nodes.
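
For the automated maintenance part, something like the following could fan one command out to every node. This is only a sketch, assuming key-based SSH access to each node is already set up. The node names and the helper names (ssh_command, run_on_all) are made up for the example.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def ssh_command(node, remote_cmd):
    """Return the ssh argv that would run remote_cmd on node."""
    # BatchMode=yes makes ssh fail instead of prompting for a password,
    # so an unreachable or unconfigured node cannot hang the whole run.
    return ["ssh", "-o", "BatchMode=yes", node, remote_cmd]


def run_on_all(nodes, remote_cmd, runner=None, max_workers=8):
    """Run remote_cmd on every node in parallel; return {node: exit_code}.

    runner takes an argv list and returns an exit code. It is injectable
    so the fan-out logic can be exercised without a real cluster.
    """
    nodes = list(nodes)
    if runner is None:
        def runner(argv):
            return subprocess.run(argv, capture_output=True, text=True).returncode
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        codes = list(pool.map(lambda n: runner(ssh_command(n, remote_cmd)), nodes))
    return dict(zip(nodes, codes))


if __name__ == "__main__":
    # Dry run: print each command and report success instead of connecting.
    def dry_run(argv):
        print(" ".join(argv))
        return 0

    hypothetical_nodes = [f"node{n:02d}" for n in range(1, 28)]  # 27 nodes
    results = run_on_all(hypothetical_nodes, "uptime", runner=dry_run)
    failed = [node for node, code in results.items() if code != 0]
    print(f"{len(results) - len(failed)} ok, {len(failed)} failed")
```

Dropping the runner argument makes it call ssh for real. The per-node exit codes show which boxes need a visit to the provisioning station.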
On Jan 23, 2015 9:36 AM, "Michael Carroll" <carroll.michael at gmail.com>
wrote:

> Stephan,
>
> I didn't realize that the Edison was wifi-only.  I'm interested to hear
> how 27 wifi devices in a metal box will work.
>
> Also, do you know if the Edison can PXE boot?  I think that's the best
> approach for booting a whole bunch of homogeneous computers; it would
> certainly be more maintenance overhead without that capability.
>
> ~mc
>
>
> On Thu, Jan 22, 2015 at 11:04 PM, Stephan Henning <shenning at gmail.com>
> wrote:
>
>> @Erik
>> Well, the RasPi and BeagleBone have less RAM than the Edison. I'll have
>> to take a look at the Rock; the Pro version offers 2 GB, but since the
>> Edison is an x86 platform it is advantageous in many ways.
>>
>> @Tim
>> Ya, that looks very similar. I'll give it a read through in the morning.
>> I'll make sure to keep you updated.
>>
>> On Thu, Jan 22, 2015 at 10:11 PM, Erik Arendall <earendall at gmail.com>
>> wrote:
>>
>>> Not sure of your ram requirements, but there are options in the RasPI,
>>> beaglebone black, and check out Radxa Rock.
>>>
>>> http://radxa.com/Rock
>>>
>>> Erik
>>> On Jan 22, 2015 10:07 PM, "Tim H" <crashcartpro at gmail.com> wrote:
>>>
>>>> This sounds like a fun project!
>>>> Reminds me of this guy:
>>>>
>>>> http://www.pcworld.idg.com.au/article/349862/seamicro_cloud_server_sports_512_atom_processors/
>>>> (cluster of low power processors in a single box)
>>>>
>>>> I'd also been kicking a similar idea around for the last year, but no
>>>> real ability to do it, so I'd love to see your progress!
>>>> -Tim
>>>>
>>>> On Thu, Jan 22, 2015 at 9:10 PM, Stephan Henning <shenning at gmail.com>
>>>> wrote:
>>>>
>>>>> In some ways, yes. The biggest limitation with the Edison for me is
>>>>> the RAM. While there is a lot that we could run on it, the RAM restricts
>>>>> those jobs enough that I don't think it would be as useful, which alters the
>>>>> true 'cost' of the setup.
>>>>>
>>>>> Granted, you could probably fit a few hundred of them in a 4U chassis.
>>>>> It would be an interesting experiment in integration though since they have
>>>>> no ethernet interface, only wireless.
>>>>>
>>>>> On Thu, Jan 22, 2015 at 9:02 PM, Erik Arendall <earendall at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> I've often kicked around the idea of doing this with Arduinos and FPGAs.
>>>>>> I guess you could also do it with Intel Edison modules. Cost-wise the
>>>>>> Edison modules would be better than a PC.
>>>>>>
>>>>>> Erik
>>>>>> On Jan 22, 2015 6:44 PM, "Stephan Henning" <shenning at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> @mc
>>>>>>> Both. If I start to scale this to a large number of nodes I can
>>>>>>> foresee many headaches if I can't easily push modifications and updates.
>>>>>>> From the job distribution side, it would be great to maintain compatibility
>>>>>>> with HTCondor; I'm just unsure how well it will operate if it has to hand
>>>>>>> jobs off to the head node that then get distributed out further.
>>>>>>>
>>>>>>> @ Brian
>>>>>>> Our current cluster is made up of discrete machines, only about 20
>>>>>>> nodes. Many of the nodes are actual user workstations that are brought in
>>>>>>> when inactive. There is no uniform provisioning method. Every box has a
>>>>>>> slightly different hardware configuration. Thankfully we do a pretty good
>>>>>>> job keeping all required software aligned to the same version.
>>>>>>>
>>>>>>> The VM idea is interesting. I hadn't considered that. I will need to
>>>>>>> think on that and how I might be able to implement it.
>>>>>>>
>>>>>>> @david
>>>>>>> Yup, I'm fully aware this level of distributed computing is only
>>>>>>> good for specific cases. I understand your position, thanks.
>>>>>>>
>>>>>>> -stephan
>>>>>>>
>>>>>>> ---———---•---———---•---———---
>>>>>>> Sent from a mobile device, please excuse the spelling and brevity.
>>>>>>>
>>>>>>> On Jan 22, 2015, at 5:54 PM, Brian Oborn <linuxpunk at gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>> I would be tempted to just copy what the in-house cluster uses for
>>>>>>> provisioning. That will save you a lot of time and make it easier to
>>>>>>> integrate with the larger cluster if you choose to do so. Although it can
>>>>>>> be tempting to get hardware in your hands, I've done a lot of work with
>>>>>>> building all of the fiddly Linux bits (DHCP+TFTP+root on NFS+NFS home) in
>>>>>>> several VMs before moving to real hardware. You can set up a private
>>>>>>> VM-only network between your head node and the slave nodes and work from
>>>>>>> there.
>>>>>>>
>>>>>>> On Thu, Jan 22, 2015 at 5:31 PM, Michael Carroll <
>>>>>>> carroll.michael at gmail.com> wrote:
>>>>>>>
>>>>>>>> So is your concern with provisioning and setup or with actual job
>>>>>>>> distribution?
>>>>>>>>
>>>>>>>> ~mc mobile
>>>>>>>>
>>>>>>>> On Jan 22, 2015, at 17:15, Stephan Henning <shenning at gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> This is a side project for the office. Sadly, most of this type of
>>>>>>>> work can't be farmed out to external clusters, otherwise we would use them
>>>>>>>> for that. We do currently utilize AWS for some of this type of work, but only
>>>>>>>> for internal R&D.
>>>>>>>>
>>>>>>>> This all started when the Intel Edison got released. Some of us
>>>>>>>> were talking about it one day and realized that it *might* have *just
>>>>>>>> enough* processing power and ram to handle some of our smaller
>>>>>>>> problems. We've talked about it some more and the discussion has evolved to
>>>>>>>> the point where I've been handed some hours and a small amount of funding
>>>>>>>> to try and implement a 'cluster-in-a-box'.
>>>>>>>>
>>>>>>>> The main idea being to rack a whole bunch of mini-itx boards on
>>>>>>>> edge into a 4U chassis (yes, they will fit). Assuming a 2" board-board
>>>>>>>> clearance across the width of the chassis and 1" spacing back-to-front down
>>>>>>>> the depth of a box, I think I could fit 27 boards into a 36" deep chassis,
>>>>>>>> with enough room for the power supplies and interconnects.
>>>>>>>>
>>>>>>>> Utilizing embedded motherboards with Atom C2750 8-core CPUs and
>>>>>>>> 16 GB of RAM per board, that should give me a pretty substantial cluster to
>>>>>>>> play with.  Obviously I am starting small, probably with two or three
>>>>>>>> boards running Q2900 4-core CPUs until I can get the software side worked
>>>>>>>> out.
>>>>>>>>
>>>>>>>> The software-infrastructure side is the part I'm having a hard time
>>>>>>>> with. While there are options out there for how to do this, they are all
>>>>>>>> relatively involved and there isn't an obvious 'best' choice to me right
>>>>>>>> now. Currently our in-house HPC cluster utilizes HTCondor for its
>>>>>>>> backbone, so I would like to maintain some sort of connection to it.
>>>>>>>> Otherwise, I'm seeing options in the Beowulf and Rocks areas that could be
>>>>>>>> useful; I'm just not sure where to start, in all honesty.
>>>>>>>>
>>>>>>>> At the end of the day this needs to be relatively easy for us to
>>>>>>>> manage (time spent working on the cluster is time spent not billing the
>>>>>>>> customer) while being easy enough to add nodes to, assuming this is a
>>>>>>>> success and I get the OK to expand it to a full 42U rack's worth.
>>>>>>>>
>>>>>>>>
>>>>>>>> Our current cluster is almost always fully utilized. Currently
>>>>>>>> we've got about a 2 month backlog of jobs on it.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Jan 22, 2015 at 4:55 PM, Brian Oborn <linuxpunk at gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> If you can keep your utilization high, then your own hardware can
>>>>>>>>> be much more cost effective. However, if you end up paying depreciation and
>>>>>>>>> maintenance on a cluster that's doing nothing most of the time you'd be
>>>>>>>>> better off in the cloud.
>>>>>>>>>
>>>>>>>>> On Thu, Jan 22, 2015 at 4:50 PM, Michael Carroll <
>>>>>>>>> carroll.michael at gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Depending on what you are going to do, it seems like it would
>>>>>>>>>> make more sense to use AWS or Digital Ocean these days, rather than
>>>>>>>>>> standing up your own hardware. Maintaining your own hardware sucks.
>>>>>>>>>>
>>>>>>>>>> That being said, if you are doing something that requires
>>>>>>>>>> InfiniBand, then hardware is your only choice :)
>>>>>>>>>>
>>>>>>>>>> ~mc
>>>>>>>>>>
>>>>>>>>>> On Thu, Jan 22, 2015 at 4:43 PM, Joshua Pritt <
>>>>>>>>>> ramgarden at gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> My friends and I installed a Beowulf cluster on a closet full of
>>>>>>>>>>> Pentium 75 MHz machines that were donated to us, just for fun, many years ago
>>>>>>>>>>> when Beowulf was just getting popular.  We never figured out anything to do
>>>>>>>>>>> with it though...
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Jan 22, 2015 at 5:31 PM, Brian Oborn <
>>>>>>>>>>> linuxpunk at gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> In my previous job I set up several production Beowulf
>>>>>>>>>>>> clusters, mainly for particle physics simulations and this has been an area
>>>>>>>>>>>> of intense interest for me. I would be excited to help you out and I think
>>>>>>>>>>>> I could provide some good assistance.
>>>>>>>>>>>>
>>>>>>>>>>>> Brian Oborn (aka bobbytables)
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Jan 22, 2015 at 4:25 PM, Stephan Henning <
>>>>>>>>>>>> shenning at gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Does anyone on the mailing list have any experience with
>>>>>>>>>>>>> setting up a cluster computation system? If so and you are willing to humor
>>>>>>>>>>>>> my questions, I'd greatly appreciate a few minutes of your time.
>>>>>>>>>>>>>
>>>>>>>>>>>>> -stephan
>>>>>>>>>>>>>
>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>> General mailing list
>>>>>>>>>>>>> General at lists.makerslocal.org
>>>>>>>>>>>>> http://lists.makerslocal.org/mailman/listinfo/general
>>>>>>>>>>>>>