Saturday, 19 May 2012

Python versus Matlab for neuroscience/psychology

A lot of people ask me why I use Python instead of Matlab, or which is easier/better to learn. Maybe it's time I provided a comparison so that psychology/neuroscience types can decide which language is better for them. Note that, although I write a prominent Python package, this article is not aimed at trying to convert you. If Matlab works for you and makes you happy, that's great! Personally, when I switched to Python I never looked back, and this explains a little about why.

Overall

Overall, Python is a more flexible language and easier to read, and for me those two things really matter. Many people don't care whether their code is readable or clear for the future. They just want it to work now. For me, being able to understand the code again in a year's time is really important, and learning a new language wasn't too hard.

A lot of the differences between Matlab and Python come down to two things: 
  1. Matlab has a commercial, proprietary development model whereas Python is open-source. I won't go into that aspect much in this post. Some time I'll write a separate post about why I personally prefer the open-source model (they each have their benefits).
  2. Matlab was designed to do maths but can be used more generally. Python was designed to be general but can be used for maths. That alters the way the languages work and the nature of the other users. That's also part of the reason that Python ships as part of Mac OS X and most Linux distributions. It's so generally useful it's made a part of the operating system.

Price and support

Price was certainly a part of my original decision to switch to Python. I was sick of setting up licenses, or getting blocked because the license server had too many users. Or needing to distribute processing to other machines, and discovering that they didn't all have the necessary (paid-for) toolboxes. But if it were just about the licensing I would have switched to Octave, a free alternative with almost identical syntax. The bigger issue I had was that I didn't actually like the language that much. Too many of the things that I felt should be core components of a language were bolt-on afterthoughts in Matlab.

Also at that point in time (2002-3) Mathworks was unsure if it would continue to support Apple Mac. I wanted to be able to choose what platform I used and not have that determined for me by Mathworks.

Generality

Ultimately there's little that Python can do that Matlab can't, and the converse is even more true. So why should it matter what they were originally designed for? Well, it does alter the decisions made by the programmers that built the systems. Matlab was designed to do maths and was extended to do much more; it was designed to be used by regular scientists, not by programmers. Python was designed as a general language that could also do maths. On the whole that means that Matlab scripts/packages work very well for moderately complex tasks but they don't scale up very easily. Python might take a little more effort to get going, but saves you headaches in the long run.

Concrete examples? OK, just a couple.

  1. How often in Matlab have you had some error message that made no sense that turned out to be caused by two functions on your path having the same name, or because you'd assigned a variable to the name of a function, and now that function doesn't work? Matlab assumes that everything on your path should be available at all times, because the developers didn't expect people to have hundreds of thousands of different functions on their path. Fair enough; if you only have 500 functions then giving each one a unique name is reasonable. Python is designed to have much larger numbers of libraries and functions installed, and the idea that each should need a unique name is quickly unworkable. So it becomes important that the entire path isn't constantly available in the 'namespace'. So in Python, like most other programming languages, you need to manually import the libraries that you want to use. That means a couple of extra lines at the start of your script but it also means you stand a better chance of avoiding name conflicts despite having a huge number of available functions in your libraries.
  2. Python was designed from the ground up to support object-oriented programming, with inheritance and dynamic updating of classes. For someone with programming experience those things are incredibly useful, allowing greater re-use of code and fewer bugs in large programs. For doing maths, object-oriented programming seems less important, so the concept was rather late to appear in Matlab, and the fact that it was bolted on as an afterthought shows.
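
To make those two points a little more concrete, here is a minimal sketch using only the standard library. The function sqrt, the classes Stimulus and Grating, and the methods describe and shout are all made up for illustration; they are not taken from PsychoPy or any other package.

```python
import math    # nothing from math is visible until we import it
import cmath   # a second library that also defines a sqrt function


# Our own function called sqrt. There is no clash, because the library
# versions live inside their own namespaces (math.sqrt and cmath.sqrt).
def sqrt(x):
    return "my own sqrt of %s" % x


real_root = math.sqrt(16)         # real-valued square root
complex_root = cmath.sqrt(-16)    # complex-valued square root
own_result = sqrt(16)             # our own function, untouched by the imports


# Hypothetical classes showing simple inheritance.
class Stimulus(object):
    def __init__(self, name):
        self.name = name

    def describe(self):
        return "stimulus " + self.name


class Grating(Stimulus):
    def __init__(self, name, sf):
        Stimulus.__init__(self, name)  # reuse the parent's set-up code
        self.sf = sf                   # spatial frequency (illustrative)


g = Grating("gabor", sf=2.0)
inherited = g.describe()   # describe() comes from Stimulus


# Dynamic updating: attach a new method to the base class afterwards.
# Instances that already exist, including g, can use it straight away.
def shout(self):
    return self.describe().upper()


Stimulus.shout = shout
shouted = g.shout()
```

The three sqrt functions coexist because each call names the namespace it comes from, and Grating gets describe (and later shout) without repeating any of the code in Stimulus.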

Powerful syntax

I don't think there's any question that Python's syntax is superior to Matlab's. Some aspects might take some getting used to (e.g. the fact that indices start at zero, or that correct indentation is a requirement). But in the end it has a huge number of features. Here are just a couple to give you the idea.

Fantastic string handling. Imagine being able to do things like this in Matlab:
>>> a='hello'
>>> b=' world'
>>> a+b #combine two strings? just add them!
'hello world'
>>> (a+b).title() #title is a method of all string objects
'Hello World'
>>> a==b #why would you want to write strcmp?!
False
>>> a>b
True
>>> str1="Strings can be surrounded by single or double quotes"
>>> str2='"Wow" and I can include the other type in the string?!'
(For other string-handling possibilities see the Python tutorial).

How about the fact that arguments to functions can be called by name rather than by location in the argument list? So if you only want the 1st and 8th argument just use their names and the other args will take the default values. Sweet! To see this in action see http://docs.python.org/tutorial/controlflow.html#keyword-arguments
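
As a rough sketch of what that looks like, here is a made-up function with eight arguments, all of which have defaults. The name make_grating and its parameters are purely hypothetical, chosen for illustration rather than copied from any real library.

```python
def make_grating(size=1.0, sf=2.0, ori=0.0, phase=0.0,
                 contrast=1.0, mask=None, units="deg", opacity=1.0):
    """Hypothetical function: return the settings it was called with."""
    return {"size": size, "sf": sf, "ori": ori, "phase": phase,
            "contrast": contrast, "mask": mask, "units": units,
            "opacity": opacity}


# Set only the 1st and 8th arguments, by name.
# The six in between keep their default values.
settings = make_grating(size=3.0, opacity=0.5)

# Named arguments can also be given in any order.
same_settings = make_grating(opacity=0.5, size=3.0)
```

Because the call names its arguments, it stays readable even if you have forgotten the order of the parameters.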

Many things are easy in Python and considerably less readable in Matlab. Maybe they aren't important to you, but when you have very large scripts they can become a huge time-saver.

Available libraries

In science there are lots of Matlab users, which is great for sharing scripts. What many people don't realise is that, overall, Python has many more users. So when you need help with, say, sound handling or importing some new file format, you are much more likely to find a ready-made library available for Python. That was another reason for me originally switching to Python; in early 2003 it already had a fully functional wrapper for OpenGL so I could use hardware-accelerated graphics directly from my scripts.

When I decided to build an editor and experiment builder GUI for PsychoPy I could do it all within Python, with relatively little effort, from existing Python libraries (e.g. wxPython). I can't imagine doing all that in Matlab (although much of it would be technically possible, it would be extremely painful).

When Microsoft changed the format of Excel files, soon enough there was a Python library (openpyxl) to read and write them, because an enthusiast went and created it. In Matlab, you still can't do that on a Mac, because Mathworks hasn't yet added it.

Ultimately

It is because of Python that I was able to write PsychoPy, and it's why other programmers have jumped on board the project. The clean easy syntax and the huge huge array of libraries allow normal people to write pretty professional applications.

12 comments:

  1. Really interesting post Jon! I'm a Matlab user, but have been meaning to check out Python/PsychoPy for ages - you just might have given me the stimulus I need to get to grips with it.

    Replies
    1. Glad you found it useful. The tricky thing when you're already competent in one language is judging whether there's enough reason to learn another. But I'm definitely seeing a gradual shift to Python in younger scientists that don't have so much invested in any one package.

    2. Yes, inertia is the biggest problem - there has to be a really good reason to switch.

  2. I too am starting to look to Python more and more as an alternative to Matlab, for many of the reasons you give. A colleague of mine recently used PsychoPy for a project we worked on... seemed to work out well for him.

    I've just put up some interesting timing comparisons at: http://myunscriptedblog.blogspot.com/2012/05/powers-of-ten.html#more

    cheers.

  3. I'm currently experimenting with the switch to Python from MATLAB. I do like some things (including some outlined in this post) but there are others that I don't like. For example, the help pages are so much better in MATLAB than in Python. Often the built-in Python docs are obscure and one has to resort to Googling. I miss being able to write A.*B for elementwise multiplication. I find the multiply(A,B) syntax inelegant, but I suppose that's just me. Transposing a row vector to a column vector is easy in MATLAB but awkward in Python. The best solution I could find was: transpose([VECT]). Maybe there's a better solution, not sure. I've had situations where I've had to convert back and forth between arrays and matrices in Python, which is rather a distraction. MATLAB is about 3 times faster for me for matrix operations such as fft2. I'm currently trying to compile ATLAS for my machine to hopefully fix that, but it's quite a headache to have to do this.

    I imagine a lot of these issues will resolve themselves over time and I will use the Python language better and become more productive. Right now, though, things are slow.

    Replies
    1. I actually never use matrices, just arrays. Then:

      - elementwise multiplication is simply A*B (no need for the dot at all, which I always used to forget in Matlab)
      - matrix multiply is achieved with A.dot(B) #or numpy.dot(A, B)
      - the transpose of A is simply A.T

      I'm not sure why you're having speed issues with fft2. In most reports python is roughly the same speed as matlab.

  4. Hi Jon. Thanks for your post.
    I'm trying to get into Python from Matlab. So, what would be your recommendation for working with matrices in Python? Numpy arrays, simple arrays or Python matrices?
    Thanks dude.
    JohannM.

    Replies
    1. I use simple numpy arrays which you can use to do matrix maths as well using function calls. Slight difference is that with an array the default operation is element-wise. e.g. a*b gives element multiplication but numpy.dot(a,b) gives the matrix multiplication (dot product). You can also use a numpy.matrix instead which has the opposite defaults, but that normally isn't what I want.

    2. Hello, I'm Dr. Rahnev, and I love Matlab. hahaha...

  5. Thanks for the interesting post. I was wondering if there is a python equivalent for analyzing EEG data such as EEGlab and Fieldtrip for Matlab? Cheers

    Replies
    1. Don't know nothing about Python but worked with a guy who seems to do nothing else all day than Python. Maybe it is this what you're looking for: https://github.com/mne-tools/mne-python

  6. Hello everyone, please could someone tell me where I could get a PDF book on Python in neuroscience for machine learning and statistics?
