[ML-General] Cluster Computing
Stephan Henning
shenning at gmail.com
Tue Feb 3 15:57:37 CST 2015
There was a group that did it a while back. I want to say they did it with
Atom processors. Ended up with 400+ nodes in a 10U rack I think.
On Tue, Feb 3, 2015 at 3:55 PM, Erik Arendall <earendall at gmail.com> wrote:
> This would be a cool project to develop a module board that contains the
> cpu/gpu of choice and required ram for use. then the modules could plug in
> to a supervisory control node.
>
> On Tue, Feb 3, 2015 at 3:50 PM, Stephan Henning <shenning at gmail.com>
> wrote:
>
>> Hey Hunter,
>>
>> Well, with the Edison, it wouldn't be 27 devices, it would be closer to
>> 400 :)
>>
>> I *think* I can fit 27 mini-itx motherboards in a 4U chassis (maybe only
>> 21-24, depending on heatsink height). For the raspi's or the Edisons to be
>> viable they would need to beat that baseline on a flop/watt vs $$
>> comparison. Even in that case, the low RAM amount limits their usefulness.
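
A rough way to frame that flop/watt vs $$ comparison, sketched in Python. The `NodeOption` class and `beats_baseline` helper are hypothetical names made up for illustration, and every number fed into them would have to come from your own benchmarks and price quotes; nothing here is a measurement of any real board.

```python
from dataclasses import dataclass


@dataclass
class NodeOption:
    """One candidate way to fill the chassis (all fields are user-supplied estimates)."""
    name: str
    gflops_per_node: float   # sustained GFLOPS per node, from your own benchmark
    watts_per_node: float    # wall power per node under load
    cost_per_node: float     # purchase price per node, in dollars
    nodes_per_chassis: int   # how many nodes fit in the box

    def gflops_per_watt(self) -> float:
        return self.gflops_per_node / self.watts_per_node

    def gflops_per_dollar(self) -> float:
        return self.gflops_per_node / self.cost_per_node

    def chassis_gflops(self) -> float:
        return self.gflops_per_node * self.nodes_per_chassis


def beats_baseline(candidate: NodeOption, baseline: NodeOption) -> bool:
    """True only if the candidate wins on both power efficiency and cost efficiency."""
    return (candidate.gflops_per_watt() > baseline.gflops_per_watt()
            and candidate.gflops_per_dollar() > baseline.gflops_per_dollar())
```

This deliberately ignores RAM per node, which is a separate constraint and would need its own check (for example, a minimum-memory filter applied before the comparison).
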
>>
>> On Tue, Feb 3, 2015 at 3:44 PM, Hunter Fuller <hfuller at pixilic.com>
>> wrote:
>>
>>> 27 devices in a metal box will work fine, provided there is also a
>>> fairly robust AP in that box. I would personally still lean toward USB
>>> Ethernet though. But that increases your device's size and complexity... Hm.
>>>
>>> As far as PXE boot, since there is no wired Ethernet available, I doubt
>>> that is a thing. However, you can mount the internal storage as /boot, and
>>> have a script run that rsyncs the /boot fs between the boxes and a server.
>>> The rest can be achieved by using an NFS volume as your root partition.
>>> This setup is commonly done on armies of raspberry pis.
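
The /boot sync step could look something like the Python sketch below. It only builds and (optionally) runs an rsync command. The server path `/srv/boot/`, the pull direction (node pulls from server), and the function names are assumptions for illustration. It uses `-rt` rather than `-a` on the assumption that /boot may sit on a FAT filesystem that cannot store Unix ownership.

```python
import shlex
import subprocess
from typing import List


def rsync_boot_command(server: str, remote_path: str = "/srv/boot/",
                       local_path: str = "/boot/") -> List[str]:
    """Build an rsync command that pulls the reference /boot tree from the server."""
    return [
        "rsync",
        "-rt",         # recurse and preserve modification times
        "--delete",    # remove local files that no longer exist on the server
        f"{server}:{remote_path}",
        local_path,
    ]


def sync_boot(server: str, dry_run: bool = True) -> str:
    """Return the command line; actually run it only when dry_run is False."""
    cmd = rsync_boot_command(server)
    if not dry_run:
        # Raises CalledProcessError if rsync exits non-zero.
        subprocess.run(cmd, check=True)
    return shlex.join(cmd)
```

Run from cron or a boot-time script on each node, `sync_boot("headnode", dry_run=False)` would keep /boot aligned with the server copy. The hostname `headnode` is a placeholder.
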
>>>
>>> There wouldn't be much difference between original prep on this and
>>> originally preparing several SD cards. In one case, you have to connect
>>> each device to a provisioning station. In the other case, you connect each
>>> SD card to the same station. Not much different, and once you boot one
>>> time, you can do the maintenance in an automated fashion across all nodes.
>>> On Jan 23, 2015 9:36 AM, "Michael Carroll" <carroll.michael at gmail.com>
>>> wrote:
>>>
>>>> Stephan,
>>>>
>>>> I didn't realize that the Edison was wifi-only. I'm interested to hear
>>>> how 27 wifi devices in a metal box will work?
>>>>
>>>> Also, do you know if the edison can pxeboot? I think that's the best
>>>> approach for booting a whole bunch of homogeneous computers, it would
>>>> certainly be more maintenance overhead without that capability.
>>>>
>>>> ~mc
>>>>
>>>>
>>>> On Thu, Jan 22, 2015 at 11:04 PM, Stephan Henning <shenning at gmail.com>
>>>> wrote:
>>>>
>>>>> @Erik
>>>>> Well, the raspi and beaglebone have less ram than the Edison. I'll
>>>>> have to take a look at the Rock, the Pro version offers 2gb, but since the
>>>>> Edison is an x86 platform it is advantageous in many ways.
>>>>>
>>>>> @Tim
>>>>> Ya, that looks very similar. I'll give it a read through in the
>>>>> morning. I'll make sure to keep you updated.
>>>>>
>>>>> On Thu, Jan 22, 2015 at 10:11 PM, Erik Arendall <earendall at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Not sure of your ram requirements, but there are options in the
>>>>>> RasPI, beaglebone black, and check out Radxa Rock.
>>>>>>
>>>>>> http://radxa.com/Rock
>>>>>>
>>>>>> Erik
>>>>>> On Jan 22, 2015 10:07 PM, "Tim H" <crashcartpro at gmail.com> wrote:
>>>>>>
>>>>>>> This sounds like a fun project!
>>>>>>> Reminds me of this guy:
>>>>>>>
>>>>>>> http://www.pcworld.idg.com.au/article/349862/seamicro_cloud_server_sports_512_atom_processors/
>>>>>>> (cluster of low power processors in a single box)
>>>>>>>
>>>>>>> I'd also been kicking a similar idea around for the last year, but
>>>>>>> no real ability to do it, so I'd love to see your progress!
>>>>>>> -Tim
>>>>>>>
>>>>>>> On Thu, Jan 22, 2015 at 9:10 PM, Stephan Henning <shenning at gmail.com
>>>>>>> > wrote:
>>>>>>>
>>>>>>>> In some ways, yes. The biggest limitation with the Edison for me is
>>>>>>>> the ram. While there is a lot that we could run on it, it restricts them
>>>>>>>> enough that I don't think it would be as useful, which alters the
>>>>>>>> true 'cost' of the setup.
>>>>>>>>
>>>>>>>> Granted, you could probably fit a few hundred of them in a 4U
>>>>>>>> chassis. It would be an interesting experiment in integration though since
>>>>>>>> they have no ethernet interface, only wireless.
>>>>>>>>
>>>>>>>> On Thu, Jan 22, 2015 at 9:02 PM, Erik Arendall <earendall at gmail.com
>>>>>>>> > wrote:
>>>>>>>>
>>>>>>>>> I've often kicked around the idea of doing this with Arduinos and
>>>>>>>>> FPGAs. I guess you could also do it with Intel Edison modules. Cost wise
>>>>>>>>> the Edison modules would be better than a PC.
>>>>>>>>>
>>>>>>>>> Erik
>>>>>>>>> On Jan 22, 2015 6:44 PM, "Stephan Henning" <shenning at gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> @mc
>>>>>>>>>> Both. If I start to scale this to a large number of nodes I can
>>>>>>>>>> foresee many headaches if I can't easily push modifications and updates.
>>>>>>>>>> From the job distribution side, it would be great to maintain compatibility
>>>>>>>>>> with condor, I'm just unsure how well it will operate if it has to hand
>>>>>>>>>> jobs off to the head node that then get distributed out further.
>>>>>>>>>>
>>>>>>>>>> @ Brian
>>>>>>>>>> Our current cluster is made up of discrete machines, only about 20
>>>>>>>>>> nodes. Many of the nodes are actual user workstations that are brought in
>>>>>>>>>> when inactive. There is no uniform provisioning method. Every box has a
>>>>>>>>>> slightly different hardware configuration. Thankfully we do a pretty good
>>>>>>>>>> job keeping all required software aligned to the same version.
>>>>>>>>>>
>>>>>>>>>> The VM idea is interesting. I hadn't considered that. I will need
>>>>>>>>>> to think on that and how I might be able to implement it.
>>>>>>>>>>
>>>>>>>>>> @david
>>>>>>>>>> Yup, I'm fully aware this level of distributed computing is only
>>>>>>>>>> good for specific cases. I understand your position, thanks.
>>>>>>>>>>
>>>>>>>>>> -stephan
>>>>>>>>>>
>>>>>>>>>> ---———---•---———---•---———---
>>>>>>>>>> Sent from a mobile device, please excuse the spelling and
>>>>>>>>>> brevity.
>>>>>>>>>>
>>>>>>>>>> On Jan 22, 2015, at 5:54 PM, Brian Oborn <linuxpunk at gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>> I would be tempted to just copy what the in-house cluster uses
>>>>>>>>>> for provisioning. That will save you a lot of time and make it easier to
>>>>>>>>>> integrate with the larger cluster if you choose to do so. Although it can
>>>>>>>>>> be tempting to get hardware in your hands, I've done a lot of work with
>>>>>>>>>> building all of the fiddly Linux bits (DHCP+TFTP+root on NFS+NFS home) in
>>>>>>>>>> several VMs before moving to real hardware. You can set up a private
>>>>>>>>>> VM-only network between your head node and the slave nodes and work from
>>>>>>>>>> there.
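
For the root-on-NFS part of that DHCP+TFTP+NFS stack, here is a minimal Python sketch that renders a boot menu entry. It assumes PXELINUX as the network bootloader, which the thread does not specify. The server address, export path, and file names are placeholders. The kernel parameters `root=/dev/nfs`, `nfsroot=` and `ip=dhcp` are the standard Linux ones for an NFS root.

```python
def pxelinux_nfsroot_entry(nfs_server, nfs_export, kernel="vmlinuz",
                           initrd="initrd.img", label="nfsroot"):
    """Render a PXELINUX menu entry that boots a node with its root on NFS."""
    append = " ".join([
        f"initrd={initrd}",
        "root=/dev/nfs",                       # tell the kernel the root fs is on NFS
        f"nfsroot={nfs_server}:{nfs_export}",  # server and export to mount as /
        "ip=dhcp",                             # bring the NIC up via DHCP before mounting
        "ro",                                  # mount read-only at first
    ])
    return "\n".join([
        f"DEFAULT {label}",
        f"LABEL {label}",
        f"  KERNEL {kernel}",
        f"  APPEND {append}",
        "",
    ])
```

The returned text would go in a file such as `pxelinux.cfg/default` under the TFTP root. Generating it from code makes it easy to try different export paths across a handful of VMs before touching real hardware.
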
>>>>>>>>>>
>>>>>>>>>> On Thu, Jan 22, 2015 at 5:31 PM, Michael Carroll <
>>>>>>>>>> carroll.michael at gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> So is your concern with provisioning and setup or with actual
>>>>>>>>>>> job distribution?
>>>>>>>>>>>
>>>>>>>>>>> ~mc mobile
>>>>>>>>>>>
>>>>>>>>>>> On Jan 22, 2015, at 17:15, Stephan Henning <shenning at gmail.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>> This is a side project for the office. Sadly, most of this type
>>>>>>>>>>> of work can't be farmed out to external clusters, otherwise we would use it
>>>>>>>>>>> for that. We do currently utilize AWS for some of this type of work, but only
>>>>>>>>>>> for internal R&D.
>>>>>>>>>>>
>>>>>>>>>>> This all started when the Intel Edison got released. Some of us
>>>>>>>>>>> were talking about it one day and realized that it *might* have *just
>>>>>>>>>>> enough* processing power and ram to handle some of our smaller
>>>>>>>>>>> problems. We've talked about it some more and the discussion has evolved to
>>>>>>>>>>> the point where I've been handed some hours and a small amount of funding
>>>>>>>>>>> to try and implement a 'cluster-in-a-box'.
>>>>>>>>>>>
>>>>>>>>>>> The main idea being to rack a whole bunch of mini-itx boards on
>>>>>>>>>>> edge into a 4U chassis (yes, they will fit). Assuming a 2" board-board
>>>>>>>>>>> clearance across the width of the chassis and 1" spacing back-to-front down
>>>>>>>>>>> the depth of a box, I think I could fit 27 boards into a 36" deep chassis,
>>>>>>>>>>> with enough room for the power supplies and interconnects.
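
The board-count estimate can be checked with a few lines of Python. The 2" pitch, 1" row gap, and 36" depth come from the message above. The Mini-ITX size (170 mm square) and the 1.75" rack unit are standard figures. The usable interior width (17") and the depth reserved for power supplies and interconnects (12") are assumptions picked as plausible values, so treat the result as a rough fence-post estimate rather than a layout.

```python
import math

RACK_UNIT_IN = 1.75          # height of 1U in inches
MINI_ITX_IN = 170 / 25.4     # Mini-ITX boards are 170 mm square (about 6.7")


def boards_that_fit(chassis_depth_in=36.0, pitch_in=2.0, row_gap_in=1.0,
                    usable_width_in=17.0, reserved_depth_in=12.0, height_u=4):
    """Estimate how many Mini-ITX boards stand on edge inside the chassis."""
    # A board on edge must be no taller than the chassis.
    if MINI_ITX_IN > height_u * RACK_UNIT_IN:
        return 0
    # Boards at 0", 2", 4", ... across the width (first board against one wall).
    per_row = math.floor(usable_width_in / pitch_in) + 1
    # Each row takes one board length plus the gap; the last row needs no gap.
    row_depth = MINI_ITX_IN + row_gap_in
    usable_depth = chassis_depth_in - reserved_depth_in
    rows = math.floor((usable_depth + row_gap_in) / row_depth)
    return per_row * rows
```

With these defaults the function gives 9 boards across and 3 rows deep. Changing `usable_width_in` or `reserved_depth_in` shows how sensitive the count is to the assumed values.
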
>>>>>>>>>>>
>>>>>>>>>>> Utilizing embedded motherboards with Atom C2750 8-core CPUs and
>>>>>>>>>>> 16gb of ram per board, that should give me a pretty substantial cluster to
>>>>>>>>>>> play with. Obviously I am starting small, probably with two or three
>>>>>>>>>>> boards running Q2900 4-core cpus until I can get the software side worked
>>>>>>>>>>> out.
>>>>>>>>>>>
>>>>>>>>>>> The software-infrastructure side is the part I'm having a hard
>>>>>>>>>>> time with. While there are options out there for how to do this, they are
>>>>>>>>>>> all relatively involved and there isn't an obvious 'best' choice to me
>>>>>>>>>>> right now. Currently our in-house HPC cluster utilizes HTCondor for its
>>>>>>>>>>> backbone, so I would like to maintain some sort of connection to it.
>>>>>>>>>>> Otherwise, I'm seeing options in the Beowulf and Rocks areas that could be
>>>>>>>>>>> useful, I'm just not sure where to start in all honesty.
>>>>>>>>>>>
>>>>>>>>>>> At the end of the day this needs to be relatively easy for us to
>>>>>>>>>>> manage (time spent working on the cluster is time spent not billing the
>>>>>>>>>>> customer) while being easy enough to add nodes to, assuming this is a
>>>>>>>>>>> success and I get the OK to expand it to a full 42U rack's worth.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Our current cluster is almost always fully utilized. Currently
>>>>>>>>>>> we've got about a 2 month backlog of jobs on it.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Jan 22, 2015 at 4:55 PM, Brian Oborn <
>>>>>>>>>>> linuxpunk at gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> If you can keep your utilization high, then your own hardware
>>>>>>>>>>>> can be much more cost effective. However, if you end up paying depreciation
>>>>>>>>>>>> and maintenance on a cluster that's doing nothing most of the time you'd be
>>>>>>>>>>>> better off in the cloud.
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Jan 22, 2015 at 4:50 PM, Michael Carroll <
>>>>>>>>>>>> carroll.michael at gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Depending on what you are going to do, it seems like it would
>>>>>>>>>>>>> make more sense to use AWS or Digital Ocean these days, rather than
>>>>>>>>>>>>> standing up your own hardware. Maintaining your own hardware sucks.
>>>>>>>>>>>>>
>>>>>>>>>>>>> That being said, if you are doing something that requires
>>>>>>>>>>>>> InfiniBand, then hardware is your only choice :)
>>>>>>>>>>>>>
>>>>>>>>>>>>> ~mc
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Thu, Jan 22, 2015 at 4:43 PM, Joshua Pritt <
>>>>>>>>>>>>> ramgarden at gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> My friends and I installed a Beowulf cluster on a closet full
>>>>>>>>>>>>>> of Pentium 75 MHz machines that were donated to us, just for fun, many years ago back
>>>>>>>>>>>>>> when Beowulf was just getting popular. We never figured out anything to do
>>>>>>>>>>>>>> with it though...
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Thu, Jan 22, 2015 at 5:31 PM, Brian Oborn <
>>>>>>>>>>>>>> linuxpunk at gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> In my previous job I set up several production Beowulf
>>>>>>>>>>>>>>> clusters, mainly for particle physics simulations and this has been an area
>>>>>>>>>>>>>>> of intense interest for me. I would be excited to help you out and I think
>>>>>>>>>>>>>>> I could provide some good assistance.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Brian Oborn (aka bobbytables)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Thu, Jan 22, 2015 at 4:25 PM, Stephan Henning <
>>>>>>>>>>>>>>> shenning at gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Does anyone on the mailing list have any experience with
>>>>>>>>>>>>>>>> setting up a cluster computation system? If so and you are willing to humor
>>>>>>>>>>>>>>>> my questions, I'd greatly appreciate a few minutes of your time.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> -stephan
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>>> General mailing list
>>>>>>>>>>>>>>>> General at lists.makerslocal.org
>>>>>>>>>>>>>>>> http://lists.makerslocal.org/mailman/listinfo/general
>>>>>>>>>>>>>>>>