RapidMiner is a great tool for non-programmers to do data mining and text analysis. This tutorial shows how to do sentiment analysis with RapidMiner using our free Twinword Sentiment Analysis API.
Before going any further, you should already have RapidMiner installed. If not, visit the link above, then download and install the full software to start your free trial.
RapidMiner already comes packed with text processing capabilities. In addition, we can use it to connect to third-party APIs to do more work, such as our Twinword Sentiment Analysis API.
However, before we can do this, we need to install an extension that will allow us to send data to the web and capture the response.
With the Web Mining extension installed, you can now connect to REST APIs to process your text and data.
To connect to our web API, you will need to use the Web Mining extension you just downloaded.
On the left “Operators” pane, find the operator called “Enrich Data by Webservice” under Web Mining > Services, and drag it into the “Process” pane.
Select the operator we just dropped in the “Process” pane to edit the “Parameters” on the right pane.
We need to set the following parameters:
|regular expression queries||
Note: we are using Regular Expression queries to match and grab the four items (“type”, “score”, “ratio”, and “keywords”) we want out of the entire JSON response that we would get back from the API.
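To illustrate what these regular expression queries do, here is a minimal Python sketch that runs outside RapidMiner. The sample response string and the four patterns are assumptions made for illustration; they are not the exact values from the parameter table, and the real API response may be laid out differently.

```python
import re

# Hypothetical response body, for illustration only.
# The real API's exact fields and layout may differ.
sample_response = (
    '{"type": "positive", "score": 0.83, "ratio": 1.0, '
    '"keywords": [{"word": "love", "score": 0.9}]}'
)

# One regular expression per attribute we want to keep.
# Capture group 1 holds the value. These patterns are assumptions.
queries = {
    "type": r'"type"\s*:\s*"([^"]*)"',
    "score": r'"score"\s*:\s*(-?[0-9.]+)',
    "ratio": r'"ratio"\s*:\s*(-?[0-9.]+)',
    "keywords": r'"keywords"\s*:\s*(\[.*?\])',
}


def extract(response, queries):
    """Apply each pattern to the raw response and keep the first match."""
    result = {}
    for name, pattern in queries.items():
        match = re.search(pattern, response, re.DOTALL)
        result[name] = match.group(1) if match else None
    return result


row = extract(sample_response, queries)
for name, value in row.items():
    print(name, "=", value)
```

Each pattern returns only its first match, so the top-level “score” is picked up here because it appears before the scores nested inside “keywords”. The “keywords” pattern stops at the first closing bracket, which would cut off a response containing nested arrays.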
Now that we have the right settings to connect with the API, we need text to send.
Before we can start, make sure that you have the “Text Processing” extension installed. If not, go back to the Marketplace (Updates and Extensions) and install it the same way you installed the “Web Mining” extension.
First, let’s create a sample document with sample text. Again, in the left “Operators” pane, find the “Create Document” operator under Text Processing > Create Document.
Then click on “Edit Text…” in the “Parameters” pane to paste in some sample text. For the purpose of this tutorial, we will just type something like the following:
I love hotdogs. Hotdogs are the greatest. They are hot and delicious.
Now we have a document. Great! However, the operator (“Enrich Data by Webservice”) we set up to connect to the API expects an input type called “Example Set”, not a “Document”.
So, we need to convert the “Document” type text we just created into an “Example Set”. Luckily, there is another operator right next door called the “Documents to Data” operator. You can find the operator under Text Processing > Documents to Data.
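Conceptually, this conversion turns each document into one row of a table. The Python sketch below is only an analogy for what “Documents to Data” does, not RapidMiner's implementation. The function name `documents_to_data` and the use of plain dictionaries as rows are assumptions for illustration.

```python
def documents_to_data(documents, text_attribute="text"):
    """Turn a list of document strings into a table: one row (dict) per document."""
    return [{text_attribute: doc} for doc in documents]


document = "I love hotdogs. Hotdogs are the greatest. They are hot and delicious."

# A single document becomes a one-row table with a single "text" column.
example_set = documents_to_data([document])
print(example_set)
```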
You’re almost there! Just connect the operators.
All that’s left now is to click run (the blue play icon at the top).
After running it, you should see the “Results” page showing a single row with several columns, including our “text” about hotdogs and the four items (“score”, “keywords”, “type”, “ratio”) that the regular expressions extracted from the Sentiment Analysis API's JSON response.
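The shape of that result row can be sketched in Python as follows. This is a rough analogy under stated assumptions: `fake_sentiment_service` is a hypothetical stand-in that returns a made-up JSON string instead of calling the real API, and the patterns in `QUERIES` are illustrative rather than the tutorial's exact settings.

```python
import re


def fake_sentiment_service(text):
    # Hypothetical stand-in for the web service call; no network is used.
    # The returned JSON is made up for illustration.
    return '{"type": "positive", "score": 0.9, "ratio": 1.0, "keywords": []}'


# Illustrative patterns (assumptions); capture group 1 is the value.
QUERIES = {
    "type": r'"type"\s*:\s*"([^"]*)"',
    "score": r'"score"\s*:\s*(-?[0-9.]+)',
    "ratio": r'"ratio"\s*:\s*(-?[0-9.]+)',
    "keywords": r'"keywords"\s*:\s*(\[.*?\])',
}


def enrich(rows, service, queries):
    """For each row, send its text to the service and add one column per query."""
    enriched = []
    for row in rows:
        response = service(row["text"])
        new_row = dict(row)  # keep the original "text" column
        for name, pattern in queries.items():
            match = re.search(pattern, response, re.DOTALL)
            new_row[name] = match.group(1) if match else None
        enriched.append(new_row)
    return enriched


rows = [{"text": "I love hotdogs. Hotdogs are the greatest. They are hot and delicious."}]
results = enrich(rows, fake_sentiment_service, QUERIES)
print(results[0])
```

The original “text” column is preserved and four new columns are added, which mirrors the one-row result described above.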
Note: If you need more explanation on the meaning of the score and ratio, please read our blog post about Interpreting the Score and Ratio of Sentiment Analysis.
If something goes wrong, you can go back to the “Design” page and make the necessary changes and run it again.
Here is a link to Twinword’s Free Sentiment Analysis API mentioned in this article.
Good luck. If you have any questions or issues, please feel free to contact us at [email protected].