Given that R, Python and many other open source libraries used for stats have better support in Linux than Windows/OSX (rPy comes to mind), I find it odd that no one has asked this question before. So I do now:
What Linux distribution do people doing stats/data analysis/machine learning prefer or recommend?
P.S.: I feel a bit embarrassed asking that, since by using Python’s and R’s inbuilt package management I should theoretically not be experiencing any conflicts with the base system. 😛
Answers:
Method 1
I think you’ll find that the underlying distro doesn’t matter much, especially if you’re using R and Python.
Typically people manage their own version of Python using virtualenv or virtualenvwrapper and install the packages they want into that, rather than trying to coexist with the distro’s Python.
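As a rough Python-side illustration of the same idea, here is a minimal sketch using the standard library's venv module. It is a stand-in for the virtualenv tool mentioned above, not the tool itself. The function name create_env and the directory name stats-env are made up for this example, and with_pip=False is only there so the sketch runs without network access.

```python
import subprocess
import sys
import tempfile
import venv
from pathlib import Path


def create_env(env_dir: Path) -> Path:
    """Create an isolated environment and return the path to its interpreter.

    Hypothetical helper for illustration only.
    """
    # with_pip=False keeps the sketch offline-friendly; in practice you would
    # normally want pip available so packages can be installed into the env.
    venv.EnvBuilder(with_pip=False).create(env_dir)
    bin_dir = "Scripts" if sys.platform == "win32" else "bin"
    exe = "python.exe" if sys.platform == "win32" else "python"
    return env_dir / bin_dir / exe


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        env_python = create_env(Path(tmp) / "stats-env")
        # Ask the environment's own interpreter where it thinks it lives.
        result = subprocess.run(
            [str(env_python), "-c", "import sys; print(sys.prefix)"],
            capture_output=True,
            text=True,
            check=True,
        )
        print("base interpreter prefix:", sys.base_prefix)
        print("environment prefix:     ", result.stdout.strip())
```

The two prefixes printed at the end should differ, which is the point: packages installed into the environment live under its own directory and do not touch the distro's Python.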
Most programming languages, such as Perl, Python, Ruby, and R, provide this management layer now. Ruby has rvm, Perl has perlbrew, and R has Renv.
In addition, they provide their own package management layer for installing libraries and tools systematically, so the distro is really of no importance with respect to these types of tools.
Examples
On my laptop right now I have several versions of Ruby installed:
    $ rvm list
    rvm rubies
       ruby-1.9.2-head [ x86_64 ]
       jruby-1.5.6 [ amd64-java ]
       ruby-1.9.2-p290 [ x86_64 ]
    => ruby-1.9.2-p180 [ x86_64 ]
       ree-1.8.7-2011.03 [ x86_64 ]
I’m currently set up to use ruby-1.9.2-p180:
    $ which ruby
    ~/.rvm/rubies/ruby-1.9.2-p180/bin/ruby
This version has several gems (libraries) installed with it as well:
    $ gem list|head -10
    abstract (1.0.0)
    actionmailer (3.0.10, 3.0.5)
    actionpack (3.0.10, 3.0.5)
    activemodel (3.0.10, 3.0.5)
    activerecord (3.0.10, 3.0.5)
    activeresource (3.0.10, 3.0.5)
    activesupport (3.0.10, 3.0.5)
    akami (1.2.0)
    albino (1.3.3)
    anemone (0.7.2)
Most of the management layers provide the same features as this. Here’s perlbrew for example:
    $ perlbrew list
      local (5.14.0)
    * perl-5.14.0
    $ which perl
    ~/apps/perl5/perlbrew/perls/perl-5.14.0/bin/perl
Python & R are no different. The advantage to managing the environment this way is that my installations are all maintained in my home directory, so I can move them from machine to machine and keep them with my work, rather than waste my time managing the distro itself for these resources.
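For Python, a rough counterpart to the gem list output above is to ask the active interpreter what it has installed. The sketch below uses importlib.metadata from the standard library (Python 3.9 or newer, given the type hints). installed_packages is a hypothetical helper name, not part of any tool mentioned here. Saving its output alongside your work is one possible way to rebuild the same set of packages on another machine.

```python
from importlib.metadata import distributions


def installed_packages() -> list[str]:
    """Return sorted 'name==version' pins for the running interpreter.

    Hypothetical helper for illustration only.
    """
    pins = set()
    for dist in distributions():
        name = dist.metadata["Name"]
        if name:  # skip entries whose metadata is missing or broken
            pins.add(f"{name}=={dist.version}")
    return sorted(pins, key=str.lower)


if __name__ == "__main__":
    # Print the first ten entries, similar in spirit to `gem list|head -10`.
    for pin in installed_packages()[:10]:
        print(pin)
```

Run inside an isolated environment, this lists only what that environment can see, so the result depends on which interpreter is active when you run it.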
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.