***As of April 3rd, 2016, this tutorial no longer works until further notice.***
The operator “Enrich Data by Webservice” of the RapidMiner Web Mining Extension seems to have trouble connecting to URLs over HTTPS. Please contact RapidMiner for support.
Please help us fix this tutorial by letting us know if you have found a solution or alternative via an email or a comment below. Thank you!

 

RapidMiner is a great tool for non-programmers to do data mining and text analysis. This is a tutorial on how to do sentiment analysis with RapidMiner. This tutorial uses our free Twinword Sentiment Analysis API.


Requirements

Step 1) Install Web Mining Extension for RapidMiner

Before going any further, you should already have RapidMiner installed. If not, visit the link above, download and install the full software to start your free trial.

RapidMiner already comes packed with text processing capabilities. In addition, we can use it to connect to third-party APIs to do more work, such as connecting to our Twinword Sentiment Analysis API.

However, before we can do this, we need to install an extension that will allow us to send data to the web and capture the response.

First, start RapidMiner and in the top menu, go to Help > Marketplace (Updates and Extensions)…

Once the Marketplace is opened, search for “Web Mining” and install the extension.

With the Web Mining extension installed, you can now connect to REST APIs to process your text and data.

Step 2) Set Up the Connection to the API

Go to the “Design” page in RapidMiner.

To connect to our web API, you will need to use the Web Mining extension you just downloaded.

On the left “Operators” pane, find the operator called “Enrich Data by Webservice” listed under Web Mining > Services > Enrich Data by Webservice.

Drag it to the center “Main Process” pane and drop it there.

Select the operator we just dropped in the “Process” pane to edit its “Parameters” in the right pane.

We need to set the following parameters:

  • query type: Regular Expression
  • attribute type: Nominal
  • regular expression queries:
      type        ^\{[^\{]*\"type\"\:\"([^\"]*)\"
      score       ^\{[^\{]*\"score\"\:([^\,]*)\,
      ratio       ^\{[^\{]*\"ratio\"\:([^\,]*)\,
      keywords    ^\{[^\{]*\"keywords\"\:\[([^\]]*)\]
  • request method: POST
  • service method: (blank)
  • body: text=<%text%>
  • url: https://twinword-sentiment-analysis.p.mashape.com/analyze/
  • separator: (blank)
  • delay: 0
  • request properties:
      X-Mashape-Key    YOUR_MASHAPE_API_KEY
      Content-Type     application/x-www-form-urlencoded
      Accept           application/json
  • encoding: SYSTEM
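
For readers who think in code, these parameters describe a plain HTTP POST. The Python sketch below builds an equivalent request with the standard library, assuming the URL and header names listed in this step. The helper name build_request is hypothetical, and the key is a placeholder. Nothing is sent: the sketch only constructs the request so you can inspect it.

```python
from urllib.parse import urlencode
from urllib.request import Request

# URL taken from the "url" parameter in this tutorial.
API_URL = "https://twinword-sentiment-analysis.p.mashape.com/analyze/"


def build_request(text, api_key):
    """Build (but do not send) the POST request described by the parameters."""
    # Mirrors the "body" parameter text=<%text%>, form-encoded.
    body = urlencode({"text": text}).encode("utf-8")
    # Mirrors the "request properties" parameter.
    headers = {
        "X-Mashape-Key": api_key,
        "Content-Type": "application/x-www-form-urlencoded",
        "Accept": "application/json",
    }
    return Request(API_URL, data=body, headers=headers, method="POST")


request = build_request("I love hotdogs.", "YOUR_MASHAPE_API_KEY")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

To actually send it, you would pass the request to urllib.request.urlopen with a real key in place of the placeholder.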

Note: we are using Regular Expression queries to match and grab the four items (“type”, “score”, “ratio”, and “keywords”) we want out of the entire JSON response that we would get back from the API.
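
To see what the four queries capture, here is a small Python sketch that runs the same patterns against a sample string. The sample response is hypothetical and shaped only to exercise the patterns; the real API response may contain other fields. The helper name extract_fields is also just for illustration.

```python
import re

# Hypothetical response text, made up to exercise the four patterns.
SAMPLE_RESPONSE = (
    '{"type":"positive","score":0.86,"ratio":1,'
    '"keywords":["love","greatest","delicious"]}'
)

# The four regular expression queries from the parameters, unchanged.
QUERIES = {
    "type": r'^\{[^\{]*\"type\"\:\"([^\"]*)\"',
    "score": r'^\{[^\{]*\"score\"\:([^\,]*)\,',
    "ratio": r'^\{[^\{]*\"ratio\"\:([^\,]*)\,',
    "keywords": r'^\{[^\{]*\"keywords\"\:\[([^\]]*)\]',
}


def extract_fields(response_text):
    """Return the first capture group of each query, or None when it does not match."""
    fields = {}
    for name, pattern in QUERIES.items():
        match = re.search(pattern, response_text)
        fields[name] = match.group(1) if match else None
    return fields


fields = extract_fields(SAMPLE_RESPONSE)
for name, value in fields.items():
    print(name, "=", value)
```

Each pattern starts at the opening brace and refuses to cross another “{” before reaching its key, so it only picks up keys that appear in the top-level object ahead of any nested object. Every captured value comes back as a string, which fits the Nominal attribute type set above.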

After you’re done, it should look something like this: [Screenshots: the “Enrich Data by Webservice” parameters, including the regular expression queries, body, and request properties]

Step 3) Set Up the Input Text

Now that we have the right settings to connect with the API, we need text to send.

Before we start, make sure that you have the “Text Processing” extension installed. If not, go back to the Marketplace (Updates and Extensions) and install it the same way you installed the “Web Mining” extension.

First, let’s create a sample document with sample text. Again, in the left “Operators” pane, find the “Create Document” operator under Text Processing > Create Document.

Drag it to the center “Process” pane and drop it there. Select it so that we can edit the “Parameters” in the right pane.

Then click on “Edit Text…” in the “Parameters” pane to paste in some sample text. For the purposes of this tutorial, we will just type something like the following: “I love hotdogs. Hotdogs are the greatest. They are hot and delicious.”

Now we have a document. Great! However, the operator (“Enrich Data by Webservice”) we set up to connect to the API expects an input type called “Example Set”, not a “Document”.

So, we need to convert the “Document” type text we just created into an “Example Set”. Luckily, there is another operator right next door called the “Documents to Data” operator. You can find the operator under Text Processing > Documents to Data.

Drag and drop it into our “Process” pane and select it.

In the “Parameters” pane, just type the word “text” in the “text attribute” field.
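
Conceptually, “Documents to Data” turns the document into a one-row table with a “text” column, and the body parameter text=<%text%> is then filled in from that column for each row. The Python sketch below mimics that flow. Both documents_to_data and fill_body are hypothetical helpers written for illustration, not RapidMiner functions, and the substitution rule is an assumption based on the <%text%> placeholder used in this tutorial.

```python
import re


def documents_to_data(documents, text_attribute="text"):
    """Turn each document string into one row (a dict) keyed by the attribute name."""
    return [{text_attribute: doc} for doc in documents]


def fill_body(template, row):
    """Replace each <%name%> placeholder with the row's value for that attribute."""
    return re.sub(r"<%(\w+)%>", lambda m: str(row[m.group(1)]), template)


rows = documents_to_data(["I love hotdogs. Hotdogs are the greatest."])
bodies = [fill_body("text=<%text%>", row) for row in rows]
print(bodies[0])
```

This is why the “text attribute” field and the placeholder in the body must use the same name: the placeholder looks up the column by that name.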

Step 4) Link the Operators Up

You’re almost there! Just connect the operators.

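  • The out port of Create Document connects to the doc port of Documents to Data.
  • The exa port of Documents to Data connects to the Exa input port of Enrich Data by Webservice.
  • The Exa output port of Enrich Data by Webservice connects to the res port of the process.

After you’re done, it should look something like this: [Screenshot: the three linked operators]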

Step 5) Run It!

All that’s left now is to click run (the blue play icon at the top).

After running it, you should see the “Results” page with a single row. Its columns include our “text” about hotdogs and the four items (“score”, “keywords”, “type”, “ratio”) that the regular expressions grabbed out of the JSON response from the Sentiment Analysis API.

Note: If you need more explanation on the meaning of the score and ratio, please read our blog post about Interpreting the Score and Ratio of Sentiment Analysis.

If something goes wrong, you can go back to the “Design” page and make the necessary changes and run it again.

Here is a link to Twinword’s Free Sentiment Analysis API mentioned in this article.

Good luck. If you have any questions or issues, please feel free to contact us at [email protected].

 

 

Joseph Shih

Keyword Researcher / Product Developer / Web and Mobile Application Developer at Twinword, Inc.
