Tuesday, April 23, 2013

Some artificial datasets for machine learning

Prompted by a question on Stack Overflow, I wrote a set of Matlab functions that generate a variety of artificial datasets for testing machine learning methods. Which classifier does well on the Two Spirals problem? What causes some classifiers to fail on this or that problem? It's often useful to have datasets on hand that are challenging for an algorithm even though the pattern is quite clear to a human, and artificial datasets provide just that.

Below is some example output of the six functions with default parameters. They can all be customized with regard to number of instances, noise, scale, etc. You can download the functions (including a demo) from my website.
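To give a feel for what such a generator involves, here is a minimal sketch of a Two Spirals dataset. It is written in Python using only the standard library and is not the Matlab code from the download. The function name `two_spirals` and its parameters (`n_per_class`, `turns`, `noise`, `seed`) are assumptions made for illustration and may not match the real functions' options.

```python
import math
import random


def two_spirals(n_per_class=100, turns=1.5, noise=0.05, seed=None):
    """Return (points, labels) for a two-spirals dataset.

    Hypothetical sketch: parameter names are illustrative only.
    points is a list of (x, y) tuples, labels a list of 0/1 ints.
    """
    rng = random.Random(seed)  # local generator so runs are reproducible
    points, labels = [], []
    for i in range(n_per_class):
        t = (i + 1) / n_per_class           # position along the arm, in (0, 1]
        angle = 2 * math.pi * turns * t     # arm winds `turns` times around the origin
        radius = t                          # radius grows linearly with the angle
        # The second arm is the first one rotated by 180 degrees (sign = -1).
        for label, sign in ((0, 1.0), (1, -1.0)):
            x = sign * radius * math.cos(angle) + rng.gauss(0, noise)
            y = sign * radius * math.sin(angle) + rng.gauss(0, noise)
            points.append((x, y))
            labels.append(label)
    return points, labels


points, labels = two_spirals(n_per_class=200, noise=0.03, seed=42)
```

Raising `noise` or `turns` makes the two arms harder to separate, which is one way to probe where a classifier starts to fail.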

