Tool: SIMON says: Powerful, flexible, open-source and easy-to-use automated machine learning knowledge discovery platform
5.0 years ago
LogIN ▴ 20

I would love to show you a project we have been working on.

SIMON is a powerful, flexible, open-source and easy-to-use knowledge discovery application. Check out the live demo or view the screenshots.

Currently, SIMON implements automated machine learning (autoML) and statistical data discovery features that help you illustrate dynamic relationships and give you a structural sense of your data.

The goal of this project is to build a unified user interface that will empower anyone to extract meaningful information from their data and enable them to rapidly use machine learning algorithms. genular is an entirely open-source community; if you wish to learn more, visit us here.

Why is this so cool?

  • automated machine learning: automation of the machine learning process for predictive analytics
  • feature discovery: you can easily discover relevant trends and patterns inside your data that would usually take years of manual handcrafting to find
  • exploratory data analysis: visual analysis of the automated machine learning results gives you instant insights with the help of many different visualization algorithms
  • sharing is caring: you can share your results with others, deploy your models instantly* (in progress) or download your data for external use
  • privacy and security: by hosting SIMON on your own dedicated servers or laptop, you don't have to worry about someone else looking at your data and your models

You can find the installation guide and complete source code on our GitHub page.

big-data machine-learning autoML systems-biology • 1.5k views

I went to the website, visited the dashboard, poked around for 10 minutes and I did not understand what the site does and how one would use it.

There are many links, buttons, etc., none of which did anything that seemed relevant. They just took me to various parts of the site, none of which answered the main questions: what is this for? What does it do? How does one use it?

For example, it says 5 models processed ... OK, what does that mean? What has been processed, and how? What was the input, what is the output, etc.?

Hi Istvan, first of all thanks for checking it out and for your input! I am sorry to hear that you couldn't find your way around.

The dashboard you visited was just a demo of the software. The demo is not meant to be used for model building; it just hosts one analysis I made so you can check out SIMON's exploratory capabilities. To use the software, you need to install it as described here on the project page.

Let me try to explain the Analysis and Exploratory processes here in a few steps:

Analysis

  1. After installing it on your PC, you run it in a browser as described in the Installation Quickstart, create an account and log in.
  2. Since you want to do machine learning, you probably have a CSV file with your data in it. For example, your file columns look like this: feature1...100, outcome/response, so you have 101 columns of data. You upload your file in SIMON=>Workspace.
  3. Now you can finally do the main analysis. Click on the file you uploaded in SIMON=>Workspace to "select it" and go to SIMON=>Analysis. To keep it short: under Predictor variables, turn on the All Columns switch, and under Response, select your "outcome/response" column. By default 5 ML classification algorithms are selected, but you can select as many as you wish from Available packages. Let's keep the other settings, like preprocessing and Train/Test partitions, at their defaults for now.

You click Validate data and submit the Analysis.
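To make the expected data layout a bit more concrete, here is a minimal Python sketch of what "predictor columns plus one response column, split into Train/Test partitions" means. This is not SIMON's code; the CSV values, the `load_table` and `train_test_split` helpers, and the 75/25 split are all assumptions made up for illustration.

```python
import csv
import io
import random

# Hypothetical CSV in the layout described above: feature columns followed by
# one outcome/response column. The values are invented for illustration.
CSV_TEXT = """feature1,feature2,feature3,outcome
0.1,1.2,3.4,yes
0.5,0.9,2.8,no
0.3,1.1,3.0,yes
0.8,0.4,2.1,no
0.2,1.3,3.6,yes
0.9,0.5,1.9,no
0.4,1.0,3.1,yes
0.7,0.6,2.3,no
"""


def load_table(text, response_column):
    """Split each row into predictor values and the response value."""
    reader = csv.DictReader(io.StringIO(text))
    predictors = [name for name in reader.fieldnames if name != response_column]
    X, y = [], []
    for row in reader:
        X.append([float(row[name]) for name in predictors])
        y.append(row[response_column])
    return predictors, X, y


def train_test_split(X, y, train_fraction=0.75, seed=0):
    """Shuffle row indices and cut them into train and test partitions.

    The 75/25 fraction is an assumed value, not SIMON's documented default.
    """
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    cut = int(len(indices) * train_fraction)
    train, test = indices[:cut], indices[cut:]
    return ([X[i] for i in train], [y[i] for i in train],
            [X[i] for i in test], [y[i] for i in test])


predictors, X, y = load_table(CSV_TEXT, response_column="outcome")
X_train, y_train, X_test, y_test = train_test_split(X, y)
print(predictors, len(X_train), len(X_test))
```

Every column except the chosen response becomes a predictor here, which is roughly what selecting All Columns and then picking a Response column amounts to.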

Exploratory

On the demo server, everything up to this step has already been done with the demo Diabetes data. At this point the models are processed in the backend, and you can track progress under SIMON=>Dashboard.

You were probably confused about what to do next, since you saw just this one queue, Diabetes Dataset, when you logged in. You need to click the check-box next to it so that you can go to SIMON=>Exploration.

There you will see detailed statistics for all the models that were built, and you can easily compare them by various model Performance Measurements. For example, if we sort them by Predict AUC, you will see that the best model by that measure is sdwd (Sparse Distance Weighted Discrimination).
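For readers unfamiliar with AUC, here is a small Python sketch of what sorting models by it means. It is only an illustration and not how SIMON computes its measurements: the `roc_auc` helper, the model names and all the scores below are made up.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    labels: 1 for the positive class, 0 for the negative class.
    scores: predicted score for the positive class; ties count as half a win.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy test-set labels and per-model scores, invented for illustration only.
y_true = [1, 0, 1, 1, 0, 0]
model_scores = {
    "model_a": [0.9, 0.2, 0.8, 0.7, 0.3, 0.1],  # separates the classes perfectly
    "model_b": [0.6, 0.5, 0.4, 0.7, 0.3, 0.8],
    "model_c": [0.5, 0.5, 0.5, 0.5, 0.5, 0.5],  # constant score, no information
}

# Sort models from best to worst AUC, like sorting the Exploration table.
ranking = sorted(
    ((name, roc_auc(y_true, scores)) for name, scores in model_scores.items()),
    key=lambda item: item[1],
    reverse=True,
)
for name, auc in ranking:
    print(f"{name}: AUC = {auc:.3f}")
```

An AUC of 1.0 means the model ranks every positive case above every negative one, while 0.5 is no better than guessing.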

Now you can click on as many models as you wish to compare and check their Variable Importance, or compare them using the different prebuilt summary graphs. You also have a number of options to download them as raw objects with all the information inside (raw data, model, partitions, etc.).

I also recorded an example video of the process; you can check it out here.

what is this for? What does it do?

In short, it trains machine learning models, and most importantly many of them (autoML), since you usually don't know in advance which algorithm will fit your data best. Here we can easily select them, add preprocessing steps such as PCA, and get results to compare.
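The "train many algorithms and compare" idea can be sketched in a few lines of Python. The two toy classifiers below (`MajorityClass` and `NearestCentroid`) are stand-ins I wrote for this example, not algorithms taken from SIMON, and the data is invented.

```python
from collections import Counter


class MajorityClass:
    """Baseline: always predict the most common training label."""

    def fit(self, X, y):
        self.label = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label for _ in X]


class NearestCentroid:
    """Predict the class whose mean feature vector is closest."""

    def fit(self, X, y):
        sums, counts = {}, Counter(y)
        for row, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(row))
            for j, value in enumerate(row):
                acc[j] += value
        self.centroids = {
            label: [v / counts[label] for v in acc] for label, acc in sums.items()
        }
        return self

    def predict(self, X):
        def dist2(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))

        return [
            min(self.centroids, key=lambda label: dist2(row, self.centroids[label]))
            for row in X
        ]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy train/test data, invented for illustration.
X_train = [[0.1, 1.2], [0.2, 1.3], [0.3, 1.1], [0.8, 0.4], [0.9, 0.5]]
y_train = ["yes", "yes", "yes", "no", "no"]
X_test = [[0.15, 1.25], [0.85, 0.45], [0.7, 0.6], [0.25, 1.0]]
y_test = ["yes", "no", "no", "yes"]

# Fit every candidate on the same partition and collect one score per model.
candidates = {"majority": MajorityClass(), "nearest_centroid": NearestCentroid()}
results = {
    name: accuracy(y_test, model.fit(X_train, y_train).predict(X_test))
    for name, model in candidates.items()
}
for name, score in sorted(results.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: accuracy = {score:.2f}")
```

Because every candidate is trained and scored on the same partitions, the resulting numbers are directly comparable, which is the point of running many algorithms side by side.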

Hopefully you now have a clearer picture. If not, please contact me directly; I would love to improve the introduction on the project page, and all suggestions are welcome!

Thanks for the details. I got further this time around.

Obviously, a huge amount of work went into this service, and it looks polished. I will say that the interface is quite counter-intuitive and that you should allow the demo site to perform a few simple analyses.

Also, overall it is way too difficult to find the valuable information on the site. You should be pushing this into view rather than expecting people to click four or five times to find it. A newcomer may never find it alone, just as I missed it the first time around.

The following comments are intended only to provide a non-ML person's perception of this tool.

I will say that the interface is quite counter-intuitive and that you should allow the demo site to perform a few simple analyses.

I second this. It is not easy to design effective GUIs, and as a subject matter expert you perhaps don't need to think about this.

I spent some time on the demo site but could not figure out what was expected to happen. If you provide a link saying live demo, then it needs to be intuitive enough. The Exploration link, which did little except present a table with some numbers, was where I gave up.

statistical data discovery features that will help you to easily illustrate dynamic relationships and provide you with a structural sense of your data.

It would also be good to list any limitations (or assumptions) for data that can be used with the tool (if there are none then great).
